To Paul Halmos
In Memoriam 


Preface 


It seems to have been decided that undergraduate mathematics today rests 
on two foundations: calculus and linear algebra. These may not be the 
best foundations for, say, number theory or combinatorics, but they serve 
quite well for undergraduate analysis and several varieties of undergradu- 
ate algebra and geometry. The really perfect sequel to calculus and linear 
algebra, however, would be a blend of the two — a subject in which calcu- 
lus throws light on linear algebra and vice versa. Look no further! This 
perfect blend of calculus and linear algebra is Lie theory (named to honor 
the Norwegian mathematician Sophus Lie, pronounced “Lee”). So why
is Lie theory not a standard undergraduate topic? 

The problem is that, until recently, Lie theory was a subject for mature 
mathematicians or else a tool for chemists and physicists. There was no 
Lie theory for novice mathematicians. Only in the last few years have there 
been serious attempts to write Lie theory books for undergraduates. These 
books broke through to the undergraduate level by making some sensible 
compromises with generality; they stick to matrix groups and mainly to the 
classical ones, such as rotation groups of n-dimensional space. 

In this book I stick to similar subject matter. The classical groups 
are introduced via a study of rotations in two, three, and four dimensions, 
which is also an appropriate place to bring in complex numbers and quater- 
nions. From there it is only a short step to studying rotations in real, 
complex, and quaternion spaces of any dimension. In so doing, one has 
introduced the classical simple Lie groups, in their most geometric form, 
using only basic linear algebra. Then calculus intervenes to find the tan- 
gent spaces of the classical groups — their Lie algebras — and to move back 
and forth between the group and its algebra via the log and exponential 
functions. Again, the basics suffice: single-variable differentiation and the 
Taylor series for $e^x$ and $\log(1+x)$.

Where my book diverges from the others is at the next level, the
miraculous level where one discovers that the (curved) structure of a Lie
group is
almost completely captured by the structure of its (flat) Lie algebra. At this 
level, the other books retain many traces of the sophisticated approach to 
Lie theory. For example, they rely on deep ideas from outside Lie theory, 
such as the inverse function theorem, existence theorems for ODEs, and 
representation theory. Even inside Lie theory, they depend on the Killing 
form and the whole root system machine to prove simplicity of the classical 
Lie algebras, and they use everything under the sun to prove the Campbell- 
Baker-Hausdorff theorem that lifts structure from the Lie algebra to the Lie 
group. But actually, proving simplicity of the classical Lie algebras can be 
done by basic matrix arithmetic, and there is an amazing elementary proof 
of Campbell-Baker-Hausdorff due to Eichler [1968].

The existence of these little-known elementary proofs convinced me 
that a naive approach to Lie theory is possible and desirable. The aim of 
this book is to carry it out — developing the central concepts and results of 
Lie theory by the simplest possible methods, mainly from single-variable 
calculus and linear algebra. Familiarity with elementary group theory is 
also desirable, but I provide a crash course on the basics of group theory in 
Sections 2.1 and 2.2. 

The naive approach to Lie theory is due to von Neumann [1929], and it 
is now possible to streamline it by using standard results of undergraduate 
mathematics, particularly the results of linear algebra. Of course, there is a 
downside to naivete. It is probably not powerful enough to prove some of 
the results for which Lie theory is famous, such as the classification of the 
simple Lie algebras and the discovery of the five exceptional algebras. 1 To 
compensate for this lack of technical power, the end-of-chapter discussions 
introduce important results beyond those proved in the book, as part of an 
informal sketch of Lie theory and its history. It is also true that the naive 
methods do not afford the same insights as more sophisticated methods. 
But they offer another insight that is often undervalued — some important 
theorems are not as difficult as they look! I think that all mathematics 
students appreciate this kind of insight. 

In any case, my approach is not entirely naive. A certain amount of
topology is essential, even in basic Lie theory, and in Chapter 8 I take
the opportunity to develop all the appropriate concepts from scratch. This
includes everything from open and closed sets to simple connectedness, so
the book contains in effect a minicourse on topology, with the rich class
of multidimensional examples that Lie theory provides. Readers already
familiar with topology can probably skip this chapter, or simply skim it to
see how Lie theory influences the subject. (Also, if time does not permit
covering the whole book, then the end of Chapter 7 is a good place to stop.)

1 I say so from painful experience, having entered Lie theory with the aim
of understanding the exceptional groups. My opinion now is that the Lie
theory that precedes the classification is a book in itself.

I am indebted to Wendy Baratta, Simon Goberstein, Brian Hall, Ro- 
han Hewson, Chris Hough, Nathan Jolly, David Kramer, Jonathan Lough, 
Michael Sun, Marc Ryser, Abe Shenitzer, Paul Stanford, Fan Wu and the 
anonymous referees for many corrections and comments. As usual, my 
wife, Elaine, served as first proofreader; my son Robert also served as the 
model for Figure 8.7. Thanks go to Monash University for the opportunity 
to teach courses from which this book has grown, and to the University of 
San Francisco for support while writing it. 

Finally, a word about my title. Readers of a certain age will remember 
the book Naive Set Theory by Paul Halmos — a lean and lively volume 
covering the parts of set theory that all mathematicians ought to know. 
Paul Halmos (1916-2006) was my mentor in mathematical writing, and I 
dedicate this book to his memory. While not attempting to emulate his style 
(which is inimitable), I hope that Naive Lie Theory can serve as a similar 
introduction to Lie groups and Lie algebras. Lie theory today has become
the subject that all mathematicians ought to know something about, so I 
believe the time has come for a naive, but mathematical, approach. 

John Stillwell 

University of San Francisco, December 2007 
Monash University, February 2008 


Contents

1 Geometry of complex numbers and quaternions
1.1 Rotations of the plane
1.2 Matrix representation of complex numbers
1.3 Quaternions
1.4 Consequences of multiplicative absolute value
1.5 Quaternion representation of space rotations
1.6 Discussion

2 Groups
2.1 Crash course on groups
2.2 Crash course on homomorphisms
2.3 The groups SU(2) and SO(3)
2.4 Isometries of $\mathbb{R}^n$ and reflections
2.5 Rotations of $\mathbb{R}^4$ and pairs of quaternions
2.6 Direct products of groups
2.7 The map from SU(2)×SU(2) to SO(4)
2.8 Discussion

3 Generalized rotation groups
3.1 Rotations as orthogonal transformations
3.2 The orthogonal and special orthogonal groups
3.3 The unitary groups
3.4 The symplectic groups
3.5 Maximal tori and centers
3.6 Maximal tori in SO($n$), U($n$), SU($n$), Sp($n$)
3.7 Centers of SO($n$), U($n$), SU($n$), Sp($n$)
3.8 Connectedness and discreteness
3.9 Discussion

4 The exponential map
4.1 The exponential map onto SO(2)
4.2 The exponential map onto SU(2)
4.3 The tangent space of SU(2)
4.4 The Lie algebra su(2) of SU(2)
4.5 The exponential of a square matrix
4.6 The affine group of the line
4.7 Discussion

5 The tangent space
5.1 Tangent vectors of O($n$), U($n$), Sp($n$)
5.2 The tangent space of SO($n$)
5.3 The tangent space of U($n$), SU($n$), Sp($n$)
5.4 Algebraic properties of the tangent space
5.5 Dimension of Lie algebras
5.6 Complexification
5.7 Quaternion Lie algebras
5.8 Discussion

6 Structure of Lie algebras
6.1 Normal subgroups and ideals
6.2 Ideals and homomorphisms
6.3 Classical non-simple Lie algebras
6.4 Simplicity of sl($n$, $\mathbb{C}$) and su($n$)
6.5 Simplicity of so($n$) for $n > 4$
6.6 Simplicity of sp($n$)
6.7 Discussion

7 The matrix logarithm
7.1 Logarithm and exponential
7.2 The exp function on the tangent space
7.3 Limit properties of log and exp
7.4 The log function into the tangent space
7.5 SO($n$), SU($n$), and Sp($n$) revisited
7.6 The Campbell-Baker-Hausdorff theorem
7.7 Eichler’s proof of Campbell-Baker-Hausdorff
7.8 Discussion

8 Topology
8.1 Open and closed sets in Euclidean space
8.2 Closed matrix groups
8.3 Continuous functions
8.4 Compact sets
8.5 Continuous functions and compactness
8.6 Paths and path-connectedness
8.7 Simple connectedness
8.8 Discussion

9 Simply connected Lie groups
9.1 Three groups with tangent space $\mathbb{R}$
9.2 Three groups with the cross-product Lie algebra
9.3 Lie homomorphisms
9.4 Uniform continuity of paths and deformations
9.5 Deforming a path in a sequence of small steps
9.6 Lifting a Lie algebra homomorphism
9.7 Discussion

Bibliography

Index

1 Geometry of complex numbers and quaternions


Preview 

When the plane is viewed as the plane $\mathbb{C}$ of complex numbers, rotation
about $O$ through angle $\theta$ is the same as multiplication by the number

$$e^{i\theta} = \cos\theta + i\sin\theta.$$

The set of all such numbers is the unit circle or 1-dimensional sphere

$$\mathbb{S}^1 = \{z : |z| = 1\}.$$

Thus $\mathbb{S}^1$ is not only a geometric object, but also an algebraic structure;
in this case a group, under the operation of complex number multiplication.
Moreover, the multiplication operation $e^{i\theta_1} \cdot e^{i\theta_2} = e^{i(\theta_1+\theta_2)}$
and the inverse operation $(e^{i\theta})^{-1} = e^{i(-\theta)}$ depend smoothly on the
parameter $\theta$. This makes $\mathbb{S}^1$ an example of what we call a Lie group.

However, in some respects $\mathbb{S}^1$ is too special to be a good illustration of
Lie theory. The group $\mathbb{S}^1$ is 1-dimensional and commutative, because
multiplication of complex numbers is commutative. This property of complex
numbers makes the Lie theory of $\mathbb{S}^1$ trivial in many ways.

To obtain a more interesting Lie group, we define the four-dimensional
algebra of quaternions and the three-dimensional sphere $\mathbb{S}^3$ of unit
quaternions. Under quaternion multiplication, $\mathbb{S}^3$ is a noncommutative Lie
group known as SU(2), closely related to the group of space rotations.


J. Stillwell, Naive Lie Theory, DOI: 10.1007/978-0-387-78214-0.1, 
© Springer Science+Business Media, LLC 2008


1.1 Rotations of the plane 

A rotation of the plane $\mathbb{R}^2$ about the origin $O$ through angle $\theta$ is a linear
transformation $R_\theta$ that sends the basis vectors $(1,0)$ and $(0,1)$ to
$(\cos\theta, \sin\theta)$ and $(-\sin\theta, \cos\theta)$, respectively (Figure 1.1).



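It follows by linearity that $R_\theta$ sends the general vector

$$(x,y) = x(1,0) + y(0,1) \quad\text{to}\quad (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta),$$

and that $R_\theta$ is represented by the matrix

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

We also call this matrix $R_\theta$. Then applying the rotation to $(x,y)$ is the same
as multiplying the column vector $\begin{pmatrix} x \\ y \end{pmatrix}$ on the left by matrix $R_\theta$, because

$$R_\theta \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix}.$$

Since we apply matrices from the left, applying $R_\varphi$ then $R_\theta$ is the same
as applying the product matrix $R_\theta R_\varphi$. (Admittedly, this matrix happens
to equal $R_\varphi R_\theta$ because both equal $R_{\theta+\varphi}$. But when we come to space
rotations the order of the matrices will be important.)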
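
The remark that $R_\theta R_\varphi$ and $R_\varphi R_\theta$ both equal $R_{\theta+\varphi}$ can be checked numerically. The following is only a minimal Python sketch, not part of the original text; the helper names `rotation`, `matmul`, and `close` and the sample angles are illustrative assumptions.

```python
import math

def rotation(theta):
    """Return the matrix R_theta as a tuple of rows."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def matmul(A, B):
    """Multiply two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def close(A, B, tol=1e-12):
    """Entrywise comparison, allowing for floating-point rounding."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

theta, phi = 0.7, 1.9  # arbitrary sample angles (an assumption of this sketch)

# Both orders of multiplication give the rotation through theta + phi.
print(close(matmul(rotation(theta), rotation(phi)), rotation(theta + phi)))
print(close(matmul(rotation(phi), rotation(theta)), rotation(theta + phi)))
```

A floating-point comparison of this kind illustrates the identity for particular angles; it is not a proof.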



Thus we can represent the geometric operation of combining successive
rotations by the algebraic operation of multiplying matrices. The main
aim of this book is to generalize this idea, that is, to study groups of linear
transformations by representing them as matrix groups. For the moment
one can view a matrix group as a set of matrices that includes, along with
any two members $A$ and $B$, the matrices $AB$, $A^{-1}$, and $B^{-1}$. Later (in
Section 7.2) we impose an extra condition that ensures “smoothness” of matrix
groups, but the precise meaning of smoothness need not be considered yet.
For those who cannot wait to see a definition, we give one in the subsection
below, but be warned that its meaning will not become completely clear
until Chapters 7 and 8.
The matrices $R_\theta$, for all angles $\theta$, form a group called the special
orthogonal group SO(2). The reason for calling rotations “orthogonal
transformations” will emerge in Chapter 3, where we generalize the idea of
rotation to the $n$-dimensional space $\mathbb{R}^n$ and define a group SO($n$) for each
dimension $n$. In this chapter we are concerned mainly with the groups
SO(2) and SO(3), which are typical in some ways, but also exceptional
in having an alternative description in terms of higher-dimensional
“numbers.”

Each rotation $R_\theta$ of $\mathbb{R}^2$ can be represented by the complex number

$$z_\theta = \cos\theta + i\sin\theta$$

because if we multiply an arbitrary point $(x,y) = x + iy$ by $z_\theta$ we get

$$\begin{aligned}
z_\theta(x + iy) &= (\cos\theta + i\sin\theta)(x + iy) \\
&= x\cos\theta - y\sin\theta + i(x\sin\theta + y\cos\theta) \\
&= (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta),
\end{aligned}$$

which is the result of rotating $(x,y)$ through angle $\theta$. Moreover, the
ordinary product $z_\theta z_\varphi$ represents the result of combining $R_\theta$ and $R_\varphi$.
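
To see this correspondence at work on specific numbers, the sketch below (an illustration added here, not the author's; the function names and sample values are assumptions) rotates a point once with the matrix formula and once by complex multiplication, then compares the product $z_\theta z_\varphi$ with $z_{\theta+\varphi}$.

```python
import cmath
import math

def rotate_with_matrix(theta, x, y):
    """Apply R_theta to (x, y) using the matrix formula."""
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c)

def rotate_with_complex(theta, x, y):
    """Multiply x + iy by z_theta = cos(theta) + i sin(theta)."""
    z = complex(math.cos(theta), math.sin(theta)) * complex(x, y)
    return (z.real, z.imag)

theta, phi = 0.4, 2.1       # arbitrary sample angles
x, y = 3.0, -1.5            # arbitrary sample point

p = rotate_with_matrix(theta, x, y)
q = rotate_with_complex(theta, x, y)
print(all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(p, q)))

# z_theta * z_phi agrees with z_(theta + phi) up to rounding.
# cmath.rect(1, t) is the unit complex number with argument t.
gap = abs(cmath.rect(1, theta) * cmath.rect(1, phi) - cmath.rect(1, theta + phi))
print(gap < 1e-12)
```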

Rotations of $\mathbb{R}^3$ and $\mathbb{R}^4$ can be represented, in a slightly more
complicated way, by four-dimensional “numbers” called quaternions. We
introduce quaternions in Section 1.3 via certain $2 \times 2$ complex matrices, and to
pave the way for them we first investigate the relation between complex
numbers and $2 \times 2$ real matrices in Section 1.2.

What is a Lie group? 

The most general definition of a Lie group $G$ is a group that is also a smooth
manifold. That is, the group “product” and “inverse” operations are smooth
functions on the manifold $G$. For readers not familiar with groups we give
a crash course in Section 2.1, but we are not going to define smooth
manifolds in this book, because we are not going to study general Lie groups.

Instead we are going to study matrix Lie groups, which include most
of the interesting Lie groups but are much easier to handle. A matrix
Lie group is a set of $n \times n$ matrices (for some fixed $n$) that is closed
under products, inverses, and nonsingular limits. The third closure condition
means that if $A_1, A_2, A_3, \ldots$ is a convergent sequence of matrices in $G$, and
$A = \lim_{k\to\infty} A_k$ has an inverse, then $A$ is in $G$. We say more about the limit
concept for matrices in Section 4.5, but for $n \times n$ real matrices it is just the
limit concept in $\mathbb{R}^{n^2}$.

We can view all matrix Lie groups as groups of real matrices, but it is 
natural to allow the matrix entries to be complex numbers or quaternions 
as well. Real entries suffice in principle because complex numbers and 
quaternions can themselves be represented by real matrices (see Sections 
1.2 and 1.3). 

It is perhaps surprising that closure under nonsingular limits is
equivalent to smoothness for matrix groups. Since we avoid the general
concept of smoothness, we cannot fully explain why closed matrix groups are
“smooth” in the technical sense. However, in Chapter 7 we will construct
a tangent space $T_{\mathbf{1}}(G)$ for any matrix Lie group $G$ from tangent vectors
to smooth paths in $G$. We find the tangent vectors using only elementary
single-variable calculus, and it can also be shown that the space $T_{\mathbf{1}}(G)$ has
the same dimension as $G$. Thus $G$ is “smooth” in the sense that it has a
tangent space, of the appropriate dimension, at each point.

Exercises 

Since rotation through angle $\theta + \varphi$ is the result of rotating through $\theta$, then
rotating through $\varphi$, we can derive formulas for $\sin(\theta + \varphi)$ and $\cos(\theta + \varphi)$
in terms of $\sin\theta$, $\sin\varphi$, $\cos\theta$, and $\cos\varphi$.

1.1.1 Explain, by interpreting $z_{\theta+\varphi}$ in two different ways, why

$$\cos(\theta + \varphi) + i\sin(\theta + \varphi) = (\cos\theta + i\sin\theta)(\cos\varphi + i\sin\varphi).$$

Deduce that

$$\begin{aligned}
\sin(\theta + \varphi) &= \sin\theta\cos\varphi + \cos\theta\sin\varphi, \\
\cos(\theta + \varphi) &= \cos\theta\cos\varphi - \sin\theta\sin\varphi.
\end{aligned}$$

1.1.2 Deduce formulas for $\sin 2\theta$ and $\cos 2\theta$ from the formulas in Exercise 1.1.1.

1.1.3 Also deduce, from Exercise 1.1.1, that

$$\tan(\theta + \varphi) = \frac{\tan\theta + \tan\varphi}{1 - \tan\theta\tan\varphi}.$$

1.1.4 Using Exercise 1.1.3, or otherwise, write down the formula for $\tan(\theta - \varphi)$,
and deduce that lines through $O$ at angles $\theta$ and $\varphi$ are perpendicular if and
only if $\tan\theta = -1/\tan\varphi$.

1.1.5 Write down the complex number $z_\theta^{-1}$ and the inverse of the matrix for
rotation through $\theta$, and verify that they correspond.


1.2 Matrix representation of complex numbers 

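A good way to see why the matrices $R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ behave the same
as the complex numbers $z_\theta = \cos\theta + i\sin\theta$ is to write $R_\theta$ as the linear
combination

$$R_\theta = \cos\theta \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \sin\theta \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

of the basis matrices

$$\mathbf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \mathbf{i} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

It is easily checked that

$$\mathbf{1}^2 = \mathbf{1}, \qquad \mathbf{1}\mathbf{i} = \mathbf{i}\mathbf{1} = \mathbf{i}, \qquad \mathbf{i}^2 = -\mathbf{1},$$

so the matrices $\mathbf{1}$ and $\mathbf{i}$ behave exactly the same as the complex numbers 1
and $i$.

In fact, the matrices

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} = a\mathbf{1} + b\mathbf{i}, \quad\text{where } a, b \in \mathbb{R},$$

behave exactly the same as the complex numbers $a + bi$ under addition
and multiplication, so we can represent all complex numbers by $2 \times 2$ real
matrices, not just the complex numbers $z_\theta$ that represent rotations. This
representation offers a “linear algebra explanation” of certain properties of
complex numbers, for example:

• The squared absolute value, $|a + bi|^2 = a^2 + b^2$, of the complex number
$a + bi$ is the determinant of the corresponding matrix $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$.

• Therefore, the multiplicative property of absolute value, $|z_1 z_2| = |z_1||z_2|$,
follows from the multiplicative property of determinants,

$$\det(A_1 A_2) = \det(A_1)\det(A_2).$$

(Take $A_1$ as the matrix representing $z_1$, and $A_2$ as the matrix
representing $z_2$.)

• The inverse $z^{-1} = \frac{a - bi}{a^2 + b^2}$ of $z = a + bi \neq 0$ corresponds to the inverse
matrix

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}^{-1} = \frac{1}{a^2 + b^2} \begin{pmatrix} a & b \\ -b & a \end{pmatrix}.$$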
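
The three properties just listed can be tested on concrete numbers. What follows is a small Python sketch under stated assumptions: the names `as_matrix`, `matmul`, and `det` are illustrative helpers, and the sample values 3 + 2i and 1 − 4i are arbitrary. Exact integer and rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def as_matrix(a, b):
    """The matrix a*1 + b*i representing the complex number a + bi."""
    return ((a, -b), (b, a))

def matmul(A, B):
    """Multiply two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

a1, b1, a2, b2 = 3, 2, 1, -4          # z1 = 3 + 2i, z2 = 1 - 4i
z = complex(a1, b1) * complex(a2, b2)

# The matrix product represents the complex product.
product = matmul(as_matrix(a1, b1), as_matrix(a2, b2))
print(product == as_matrix(z.real, z.imag))

# The determinant is the squared absolute value, and it is multiplicative.
print(det(as_matrix(a1, b1)) == a1**2 + b1**2)
print(det(product) == det(as_matrix(a1, b1)) * det(as_matrix(a2, b2)))

# The stated inverse matrix really is the inverse.
scale = Fraction(1, a1**2 + b1**2)
inverse = ((scale * a1, scale * b1), (scale * -b1, scale * a1))
print(matmul(as_matrix(a1, b1), inverse) == ((1, 0), (0, 1)))
```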

The two-square identity 

If we set $z_1 = a_1 + ib_1$ and $z_2 = a_2 + ib_2$, then the multiplicative property
of (squared) absolute value states that

$$(a_1^2 + b_1^2)(a_2^2 + b_2^2) = (a_1 a_2 - b_1 b_2)^2 + (a_1 b_2 + a_2 b_1)^2,$$

as can be checked by working out the product $z_1 z_2$ and its squared absolute
value. This identity is particularly interesting in the case of integers
$a_1, b_1, a_2, b_2$, because it says that

(a sum of two squares) × (a sum of two squares) = (a sum of two squares).

This fact was noticed nearly 2000 years ago by Diophantus, who mentioned
an instance of it in Book III, Problem 19, of his Arithmetica. However,
Diophantus said nothing about sums of three squares, and with good reason,
because there is no such three-square identity. For example

$$(1^2 + 1^2 + 1^2)(0^2 + 1^2 + 2^2) = 3 \times 5 = 15,$$

and 15 is not a sum of three integer squares.
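
Both claims are easy to confirm by machine. The sketch below is an illustration only; the function names `compose` and `is_sum_of_three_squares` and the sample factors 5 and 17 are assumptions of this example. It applies the two-square identity to one pair of integers and then searches exhaustively for a representation of 15 as a sum of three squares.

```python
import math

def compose(a1, b1, a2, b2):
    """Return (x, y) with (a1^2 + b1^2)(a2^2 + b2^2) = x^2 + y^2."""
    return (a1 * a2 - b1 * b2, a1 * b2 + a2 * b1)

def is_sum_of_three_squares(n):
    """Brute-force search over 0 <= x <= y <= z <= sqrt(n)."""
    r = math.isqrt(n)
    return any(
        x * x + y * y + z * z == n
        for x in range(r + 1)
        for y in range(x, r + 1)
        for z in range(y, r + 1)
    )

# 5 = 1^2 + 2^2 and 17 = 1^2 + 4^2, so 85 is again a sum of two squares.
x, y = compose(1, 2, 1, 4)
print(x * x + y * y == 5 * 17)

# 3 and 5 are sums of three squares, but their product 15 is not.
print(is_sum_of_three_squares(3), is_sum_of_three_squares(5))
print(is_sum_of_three_squares(15))
```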

This is an early warning sign that there are no three-dimensional
numbers. In fact, there are no $n$-dimensional numbers for any $n > 2$; however,
there is a “near miss” for $n = 4$. One can define “addition” and
“multiplication” for quadruples $q = (a, b, c, d)$ of real numbers so as to satisfy all
the basic laws of arithmetic except $q_1 q_2 = q_2 q_1$ (the commutative law of
multiplication). This system of arithmetic for quadruples is the quaternion
algebra that we introduce in the next section.


Exercises

1.2.1 Derive the two-square identity from the multiplicative property of det.

1.2.2 Write 5 and 13 as sums of two squares, and hence express 65 as a sum of
two squares using the two-square identity.

1.2.3 Using the two-square identity, express $37^2$ and $37^4$ as sums of two nonzero
squares.

The absolute value $|z| = \sqrt{a^2 + b^2}$ represents the distance of $z$ from $O$, and
more generally, $|u - v|$ represents the distance between $u$ and $v$. When combined
with the distributive law,

$$u(v - w) = uv - uw,$$

a geometric property of multiplication comes to light.

1.2.4 Deduce, from the distributive law and multiplicative absolute value, that

$$|uv - uw| = |u||v - w|.$$

Explain why this says that multiplication of the whole plane of complex
numbers by $u$ multiplies all distances by $|u|$.

1.2.5 Deduce from Exercise 1.2.4 that multiplication of the whole plane of
complex numbers by $\cos\theta + i\sin\theta$ leaves all distances unchanged.

A map that leaves all distances unchanged is called an isometry (from the
Greek for “same measure”), so multiplication by $\cos\theta + i\sin\theta$ is an isometry of
the plane. (In Section 1.1 we defined the corresponding rotation map $R_\theta$ as a linear
map that moves 1 and $i$ in a certain way; it is not obvious from this definition that
a rotation is an isometry.)

1.3 Quaternions 

By associating the ordered pair $(a, b)$ with the complex number $a + ib$ or the
matrix $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$ we can speak of the “sum,” “product,” and “absolute value”
of ordered pairs. In the same way, we can speak of the “sum,” “product,”
and “absolute value” of ordered quadruples by associating each ordered
quadruple $(a, b, c, d)$ of real numbers with the matrix

$$q = \begin{pmatrix} a + id & -b - ic \\ b - ic & a - id \end{pmatrix}. \qquad (*)$$

We call any matrix of the form (*) a quaternion. (This is not the only
way to associate a matrix with a quadruple. I have chosen these complex
matrices because they extend the real matrices used in the previous
section to represent complex numbers. Thus complex numbers are the special
quaternions with $c = d = 0$.)

It is clear that the sum of any two matrices of the form (*) is another
matrix of the same form, and it can be checked (Exercise 1.3.2) that the
product of two matrices of the form (*) is of the form (*). Thus we can
define the sum and product of quaternions to be just the matrix sum and
product. Also, if the squared absolute value $|q|^2$ of a quaternion $q$ is
defined to be the determinant of $q$, then we have

$$\det q = \det \begin{pmatrix} a + id & -b - ic \\ b - ic & a - id \end{pmatrix} = a^2 + b^2 + c^2 + d^2.$$

So $|q|^2$ is the squared distance of the point $(a, b, c, d)$ from $O$ in $\mathbb{R}^4$.
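
As a concrete check of the last two paragraphs, here is a minimal Python sketch (added for illustration; the helper names `quaternion`, `matmul`, `det`, and `has_form` and the two sample quadruples are assumptions). It builds matrices of the form (*), confirms that their product again has that form, and compares determinants with sums of four squares. The entries are small integers, so Python's complex arithmetic is exact here.

```python
def quaternion(a, b, c, d):
    """The 2x2 complex matrix (*) attached to the quadruple (a, b, c, d)."""
    return ((complex(a, d), complex(-b, -c)),
            (complex(b, -c), complex(a, -d)))

def matmul(A, B):
    """Multiply two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def has_form(m):
    """True if m looks like (alpha, -beta; conj(beta), conj(alpha))."""
    return (m[1][1] == m[0][0].conjugate()
            and m[1][0] == -m[0][1].conjugate())

q1 = quaternion(1, 2, 3, 4)
q2 = quaternion(2, -1, 0, 5)

print(det(q1) == 1**2 + 2**2 + 3**2 + 4**2)   # determinant = a^2+b^2+c^2+d^2
print(has_form(matmul(q1, q2)))               # product is again of the form (*)
print(det(matmul(q1, q2)) == det(q1) * det(q2))
```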

The quaternion sum operation has the same basic properties as addition
for numbers, namely

$$\begin{aligned}
q_1 + q_2 &= q_2 + q_1, && \text{(commutative law)} \\
q_1 + (q_2 + q_3) &= (q_1 + q_2) + q_3, && \text{(associative law)} \\
q + (-q) &= \mathbf{0} \quad \text{where $\mathbf{0}$ is the zero matrix,} && \text{(inverse law)} \\
q + \mathbf{0} &= q. && \text{(identity law)}
\end{aligned}$$

The quaternion product operation does not have all the properties of
multiplication of numbers (in general, the commutative property
$q_1 q_2 = q_2 q_1$ fails), but well-known properties of the matrix product imply
the following properties of the quaternion product:

$$\begin{aligned}
q_1 (q_2 q_3) &= (q_1 q_2) q_3, && \text{(associative law)} \\
q q^{-1} &= \mathbf{1} \quad \text{for } q \neq \mathbf{0}, && \text{(inverse law)} \\
q \mathbf{1} &= q, && \text{(identity law)} \\
q_1 (q_2 + q_3) &= q_1 q_2 + q_1 q_3. && \text{(left distributive law)}
\end{aligned}$$

Here $\mathbf{0}$ and $\mathbf{1}$ denote the $2 \times 2$ zero and identity matrices, which are also
quaternions. The right distributive law $(q_2 + q_3) q_1 = q_2 q_1 + q_3 q_1$ of course
holds too, and is distinct from the left distributive law because of the
noncommutative product.

The noncommutative nature of the quaternion product is exposed more
clearly when we write

$$\begin{pmatrix} a + di & -b - ci \\ b - ci & a - di \end{pmatrix} = a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k},$$

where

$$\mathbf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \mathbf{i} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad \mathbf{j} = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad \mathbf{k} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}.$$

Thus $\mathbf{1}$ behaves like the number 1, $\mathbf{i}^2 = -\mathbf{1}$ as before, and also
$\mathbf{j}^2 = \mathbf{k}^2 = -\mathbf{1}$. The noncommutativity is concentrated in the products of
$\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$, which are summarized in Figure 1.2. The product of any two
distinct elements is the third element in the circle, with a + sign if an arrow
points from the first element to the second, and a − sign otherwise. For
example, $\mathbf{ij} = \mathbf{k}$, but $\mathbf{ji} = -\mathbf{k}$, so $\mathbf{ij} \neq \mathbf{ji}$.

Figure 1.2: Products of the imaginary quaternion units.
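
The products of the units can be verified directly from their matrices. The sketch below is illustrative only: the matrices `ONE`, `I`, `J`, `K` are the ones read off from the decomposition $a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ of the matrix (*), and the helper names `matmul` and `neg` are assumptions of this example.

```python
# Unit quaternions as 2x2 complex matrices (Python writes the complex unit as 1j).
ONE = ((1, 0), (0, 1))
I = ((0, -1), (1, 0))
J = ((0, -1j), (-1j, 0))
K = ((1j, 0), (0, -1j))

def matmul(A, B):
    """Multiply two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def neg(A):
    """Negate every entry of a matrix."""
    return tuple(tuple(-x for x in row) for row in A)

# Squares of the imaginary units are all -1.
print(all(matmul(U, U) == neg(ONE) for U in (I, J, K)))

# Products in the direction of the arrows carry a + sign ...
print(matmul(I, J) == K, matmul(J, K) == I, matmul(K, I) == J)

# ... and products against the arrows carry a - sign.
print(matmul(J, I) == neg(K), matmul(K, J) == neg(I), matmul(I, K) == neg(J))
```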

The failure of the commutative law is actually a good thing, because it 
enables quaternions to represent other things that do not commute, such as 
rotations in three and four dimensions. 

As with complex numbers, there is a linear algebra explanation of some 
less obvious properties of quaternion multiplication. 


• The absolute value has the multiplicative property $|q_1 q_2| = |q_1||q_2|$,
by the multiplicative property of det: $\det(q_1 q_2) = \det(q_1)\det(q_2)$.

• Each nonzero quaternion $q$ has an inverse $q^{-1}$, namely the matrix
inverse of $q$.

• From the matrix (*) for $q$ we get an explicit formula for $q^{-1}$. If
$q = a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} \neq 0$ then

$$q^{-1} = \frac{1}{a^2 + b^2 + c^2 + d^2}\,(a\mathbf{1} - b\mathbf{i} - c\mathbf{j} - d\mathbf{k}).$$

• The quaternion $a\mathbf{1} - b\mathbf{i} - c\mathbf{j} - d\mathbf{k}$ is called the quaternion conjugate
$\bar{q}$ of $q = a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$, and we have $q\bar{q} = a^2 + b^2 + c^2 + d^2 = |q|^2$.

• The quaternion conjugate is not the result of taking the complex
conjugate of each entry in the matrix $q$. In fact, $\bar{q}$ is the result of taking
the complex conjugate of each entry in the transposed matrix $q^{\mathrm{T}}$.
Then it follows from $(q_1 q_2)^{\mathrm{T}} = q_2^{\mathrm{T}} q_1^{\mathrm{T}}$ that $\overline{q_1 q_2} = \bar{q}_2\,\bar{q}_1$.
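
The inverse and conjugate formulas can also be checked in coordinates. In the sketch below, a quaternion $a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ is stored as the tuple (a, b, c, d); the component formula in `qmul` is an assumption of this example, obtained by expanding the product with the rules $\mathbf{ij} = \mathbf{k}$, $\mathbf{jk} = \mathbf{i}$, $\mathbf{ki} = \mathbf{j}$ of Figure 1.2, and all function names are illustrative. Rational arithmetic keeps the comparisons exact.

```python
from fractions import Fraction

def qmul(p, q):
    """Quaternion product in components, expanded using ij = k, jk = i, ki = j."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    """Quaternion conjugate a - bi - cj - dk."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def norm2(q):
    """Squared absolute value a^2 + b^2 + c^2 + d^2."""
    return sum(x * x for x in q)

def inverse(q):
    """Conjugate divided by the squared absolute value (q must be nonzero)."""
    n = Fraction(norm2(q))
    return tuple(x / n for x in conj(q))

q1 = (1, 2, 3, 4)
q2 = (2, -1, 0, 5)

print(qmul(q1, conj(q1)) == (norm2(q1), 0, 0, 0))         # q * conj(q) = |q|^2
print(qmul(q1, inverse(q1)) == (1, 0, 0, 0))              # q * q^{-1} = 1
print(conj(qmul(q1, q2)) == qmul(conj(q2), conj(q1)))     # order is reversed
print(conj(qmul(q1, q2)) == qmul(conj(q1), conj(q2)))     # same order fails here
```

For this particular pair the last comparison prints False, which shows that the reversal of order in $\overline{q_1 q_2} = \bar{q}_2\,\bar{q}_1$ cannot be dropped.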

The algebra of quaternions was discovered by Hamilton in 1843, and
it is denoted by $\mathbb{H}$ in his honor. He started with just $\mathbf{i}$ and $\mathbf{j}$ (hoping to
find an algebra of triples analogous to the complex algebra of pairs), but
later introduced $\mathbf{k} = \mathbf{ij}$ to escape from apparently intractable problems with
triples (he did not know, at first, that there is no three-square identity). The
matrix representation was discovered in 1858, by Cayley.

The 3-sphere of unit quaternions 

The quaternions $a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ of absolute value 1, or unit quaternions,
satisfy the equation

$$a^2 + b^2 + c^2 + d^2 = 1.$$

Hence they form the analogue of the sphere, called the 3-sphere $\mathbb{S}^3$, in the
space $\mathbb{R}^4$ of all 4-tuples $(a, b, c, d)$. It follows from the multiplicative
property and the formula for inverses above that the product of unit quaternions
is again a unit quaternion, and hence $\mathbb{S}^3$ is a group under quaternion
multiplication. Like the 1-sphere $\mathbb{S}^1$ of unit complex numbers, the 3-sphere
of unit quaternions encapsulates a group of rotations, though not quite so
directly. In the next two sections we show how unit quaternions may be
used to represent rotations of ordinary space $\mathbb{R}^3$.


Exercises 


When Hamilton discovered $\mathbb{H}$ he described quaternion multiplication very
concisely by the relations

$$\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = \mathbf{ijk} = -1.$$

1.3.1 Verify that Hamilton’s relations hold for the matrices $\mathbf{1}$, $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$. Also
show (assuming associativity and inverses) that these relations imply all
the products of $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ shown in Figure 1.2.

1.3.2 Verify that the product of quaternions is indeed a quaternion. (Hint: It helps
to write each quaternion in the form

$$q = \begin{pmatrix} \alpha & -\beta \\ \bar{\beta} & \bar{\alpha} \end{pmatrix},$$

where $\bar{\alpha} = x - iy$ is the complex conjugate of $\alpha = x + iy$.)

1.3.3 Check that $\bar{q}$ is the result of taking the complex conjugate of each entry in
$q^{\mathrm{T}}$, and hence show that $\overline{q_1 q_2} = \bar{q}_2\,\bar{q}_1$ for any quaternions $q_1$ and $q_2$.

1.3.4 Also check that $q\bar{q} = |q|^2$.

Cayley’s matrix representation makes it easy (in principle) to derive an
amazing algebraic identity.

1.3.5 Show that the multiplicative property of determinants gives the complex
two-square identity (discovered by Gauss around 1820)

$$(|\alpha_1|^2 + |\beta_1|^2)(|\alpha_2|^2 + |\beta_2|^2) = |\alpha_1\alpha_2 - \beta_1\bar{\beta}_2|^2 + |\alpha_1\beta_2 + \beta_1\bar{\alpha}_2|^2.$$

1.3.6 Show that the multiplicative property of determinants gives the real
four-square identity

$$\begin{aligned}
(a_1^2 + b_1^2 + c_1^2 + d_1^2)(a_2^2 + b_2^2 + c_2^2 + d_2^2) ={} & (a_1 a_2 - b_1 b_2 - c_1 c_2 - d_1 d_2)^2 \\
& + (a_1 b_2 + b_1 a_2 + c_1 d_2 - d_1 c_2)^2 \\
& + (a_1 c_2 - b_1 d_2 + c_1 a_2 + d_1 b_2)^2 \\
& + (a_1 d_2 + b_1 c_2 - c_1 b_2 + d_1 a_2)^2.
\end{aligned}$$

This identity was discovered by Euler in 1748, nearly 100 years before the
discovery of quaternions! Like Diophantus, he was interested in the case of integer
squares, in which case the identity says that

(a sum of four squares) × (a sum of four squares) = (a sum of four squares).

This was the first step toward proving the theorem that every positive integer is
the sum of four integer squares. The proof was completed by Lagrange in 1770.

1.3.7 Express 97 and 99 as sums of four squares.

1.3.8 Using Exercise 1.3.6, or otherwise, express 97 × 99 as a sum of four squares.


1.4 Consequences of multiplicative absolute value 

The multiplicative absolute value, for both complex numbers and
quaternions, first appeared in number theory as a property of sums of squares. It
was noticed only later that it has geometric implications, relating
multiplication to rigid motions of $\mathbb{R}^2$, $\mathbb{R}^3$, and $\mathbb{R}^4$. Suppose first that $u$ is a complex
number of absolute value 1. Without any computation with $\cos\theta$ and $\sin\theta$,
we can see that multiplication of $\mathbb{C} = \mathbb{R}^2$ by $u$ is a rotation of the plane as
follows.

Let $v$ and $w$ be any two complex numbers, and consider their images,
$uv$ and $uw$ under multiplication by $u$. Then we have

$$\begin{aligned}
\text{distance from $uv$ to $uw$} &= |uv - uw| \\
&= |u(v - w)| && \text{by the distributive law} \\
&= |u||v - w| && \text{by multiplicative absolute value} \\
&= |v - w| && \text{because } |u| = 1 \\
&= \text{distance from $v$ to $w$.}
\end{aligned}$$

In other words, multiplication by $u$ with $|u| = 1$ is a rigid motion, also
known as an isometry, of the plane. Moreover, this isometry leaves $O$
fixed, because $u \times 0 = 0$. And if $u \neq 1$, no other point $v$ is fixed, because
$uv = v$ implies $u = 1$. The only motion of the plane with these properties
is rotation about $O$.
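
A quick numerical illustration of this argument, offered as a sketch rather than as part of the original exposition (the sample values of `u`, `v`, and `w` are arbitrary assumptions): multiplication by a unit complex number leaves a distance unchanged, while multiplication by a general $u$ scales it by $|u|$.

```python
import math

u = complex(0.6, 0.8)                 # |u| = 1, since 0.36 + 0.64 = 1
v, w = complex(2.0, -1.0), complex(-3.5, 4.0)

before = abs(v - w)                   # distance from v to w
after = abs(u * v - u * w)            # distance from uv to uw
print(math.isclose(before, after))    # distances agree up to rounding

# For a general multiplier the distance is scaled by |u|.
big = complex(3.0, 4.0)               # |big| = 5
scaled = abs(big * v - big * w)
print(math.isclose(scaled, 5 * before))

# The origin stays fixed.
print(u * 0 == 0)
```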

Exactly the same argument applies to quaternion multiplication, at least
as far as preservation of distance is concerned: if we multiply the space
$\mathbb{R}^4$ of quaternions by a quaternion of absolute value 1, then the result is
an isometry of $\mathbb{R}^4$ that leaves the origin fixed. It is in fact reasonable to
interpret this isometry of $\mathbb{R}^4$ as a “rotation,” but first we want to show that
quaternion multiplication also gives a way to study rotations of $\mathbb{R}^3$. To see
how, we look at a natural three-dimensional subspace of the quaternions.

Pure imaginary quaternions 

The pure imaginary quaternions are those of the form 

p = bi + cj + t/k. 

They form a three-dimensional space that we will denote by Mi -f Mj + Mk, 
or sometimes M 3 for short. The space Mi + Mj + Mk is the orthogonal 
complement to the line Ml of quaternions of the form al, which we will 
call real quaternions. From now on we write the real quaternion al simply 
as a, and denote the line of real quaternions simply by M. 

It is clear that the sum of any two members of Mi + Mj + Mk is itself 
a member of Mi + Mj + Mk, but this is not generally true of products. In 
fact, if u = u\i + uf] + u 3 k and v = vii + mj + v 2 k then the multiplication 
diagram for i, j, and k (Figure 1.2) gives 

UV = — (u\V 1 + J< 2 V 2 + M3V3) 

+ (w 2 v 3 - n 3 v 2 )i - («1 v 3 - u 3 vi )i + {u l v 2 - u 2 v 1 )k. 


1 .4 Consequences of multiplicative absolute value 
This relates the quaternion product uv to two other products on $\mathbb{R}^3$ that are
well known in linear algebra: the inner (or “scalar” or “dot”) product,

$$u \cdot v = u_1 v_1 + u_2 v_2 + u_3 v_3,$$

and the vector (or “cross”) product

$$u \times v = \begin{vmatrix} i & j & k \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} = (u_2 v_3 - u_3 v_2)i - (u_1 v_3 - u_3 v_1)j + (u_1 v_2 - u_2 v_1)k.$$

In terms of the scalar and vector products, the quaternion product is

$$uv = -u \cdot v + u \times v.$$

Since $u \cdot v$ is a real number, this formula shows that uv is in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$
only if $u \cdot v = 0$, that is, only if u is orthogonal to v.

The formula $uv = -u \cdot v + u \times v$ also shows that uv is real if and only
if $u \times v = 0$, that is, if u and v have the same (or opposite) direction. In
particular, if $u \in \mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ and $|u| = 1$ then

$$u^2 = -u \cdot u = -|u|^2 = -1.$$

Thus every unit vector in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ is a “square root of $-1$.” (This, by
the way, is another sign that $\mathbb{H}$ does not satisfy all the usual laws of algebra.
If it did, the equation $u^2 = -1$ would have at most two solutions.)
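The product formula for pure imaginary quaternions, and the remark about square roots of $-1$, can be checked by direct computation. The following is a minimal sketch rather than anything from the text itself: it assumes a quaternion $a + bi + cj + dk$ is stored as the 4-tuple `(a, b, c, d)`, and the helper names `qmul`, `dot`, `cross`, and `pure` are illustrative choices.

```python
from fractions import Fraction

def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def dot(u, v):
    """Inner product of two vectors of R^3."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Vector product of two vectors of R^3."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)

def pure(u):
    """Embed a vector of R^3 as a pure imaginary quaternion."""
    return (0, *u)

# uv = -(u . v) + (u x v) for pure imaginary u and v.
u = (1, 2, 3)
v = (4, -5, 6)
product = qmul(pure(u), pure(v))
expected = (-dot(u, v), *cross(u, v))

# A unit vector (exact rational coordinates) squares to -1.
unit = (Fraction(3, 5), 0, Fraction(4, 5))
unit_squared = qmul(pure(unit), pure(unit))
```

Exact integer and rational inputs are used so that the comparison involves no rounding; any other unit vector of the imaginary space would serve equally well for the last check.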


Exercises 

The cross product is an operation on $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ because $u \times v$ is in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$
for any $u, v \in \mathbb{R}i + \mathbb{R}j + \mathbb{R}k$. However, it is neither a commutative nor associative
operation, as Exercises 1.4.1 and 1.4.3 show.

1.4.1 Prove the antisymmetric property $u \times v = -v \times u$.

1.4.2 Prove that $u \times (v \times w) = v(u \cdot w) - w(u \cdot v)$ for pure imaginary u, v, w.

1.4.3 Deduce from Exercise 1.4.2 that $\times$ is not associative.

1.4.4 Also deduce the Jacobi identity for the cross product:

$$u \times (v \times w) + w \times (u \times v) + v \times (w \times u) = 0.$$

The antisymmetric and Jacobi properties show that the cross product is not
completely lawless. These properties define what we later call a Lie algebra.
1.5 Quaternion representation of space rotations

A quaternion t of absolute value 1, like a complex number of absolute value
1, has a “real part” $\cos\theta$ and an “imaginary part” of absolute value $\sin\theta$,
orthogonal to the real part and hence in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$. This means that

$$t = \cos\theta + u\sin\theta,$$

where u is a unit vector in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, and hence $u^2 = -1$ by the remark
at the end of the previous section.

Such a unit quaternion t induces a rotation of $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, though
not simply by multiplication, since the product of t and a member q of
$\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ may not belong to $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$. Instead, we send each
$q \in \mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ to $t^{-1}qt$, which turns out to be a member of $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$.
To see why, first note that

$$t^{-1} = \bar{t}/|t|^2 = \cos\theta - u\sin\theta,$$

by the formulas for $q^{-1}$ and $\bar{q}$ in Section 1.3.

Since $t^{-1}$ exists, multiplication of $\mathbb{H}$ on either side by $t$ or $t^{-1}$ is an
invertible map and hence a bijection of $\mathbb{H}$ onto itself. It follows that the
map $q \mapsto t^{-1}qt$, called conjugation by t, is a bijection of $\mathbb{H}$. Conjugation by
t also maps the real line $\mathbb{R}$ onto itself, because $t^{-1}rt = r$ for a real number
r; hence it also maps the orthogonal complement $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ onto itself.
This is because conjugation by t is an isometry, since multiplication on
either side by a unit quaternion is an isometry.

It looks as though we are onto something with conjugation by $t =
\cos\theta + u\sin\theta$, and indeed we have the following theorem.

Rotation by conjugation. If $t = \cos\theta + u\sin\theta$, where $u \in \mathbb{R}i + \mathbb{R}j + \mathbb{R}k$
is a unit vector, then conjugation by t rotates $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ through angle
$2\theta$ about axis u.

Proof. First, observe that the line $\mathbb{R}u$ of real multiples of u is fixed by the
conjugation map, because

$$\begin{aligned}
t^{-1}ut &= (\cos\theta - u\sin\theta)\,u\,(\cos\theta + u\sin\theta) \\
&= (u\cos\theta - u^2\sin\theta)(\cos\theta + u\sin\theta) \\
&= (u\cos\theta + \sin\theta)(\cos\theta + u\sin\theta) && \text{because } u^2 = -1 \\
&= u(\cos^2\theta + \sin^2\theta) + \sin\theta\cos\theta + u^2\sin\theta\cos\theta \\
&= u && \text{also because } u^2 = -1.
\end{aligned}$$


It follows, since conjugation by t is an isometry of $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, that
its restriction to the plane through O in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ orthogonal to the line
$\mathbb{R}u$ is also an isometry. And if the restriction to this plane is a rotation, then
conjugation by t is a rotation of the whole space $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$.

To see whether this is indeed the case, choose a unit vector v orthogonal
to u in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, so $u \cdot v = 0$. Then let $w = u \times v$, which equals uv
because $u \cdot v = 0$, so $\{u, v, w\}$ is an orthonormal basis of $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ with
$uv = w$, $vw = u$, $wu = v$, $uv = -vu$ and so on. It remains to show that

$$t^{-1}vt = v\cos 2\theta - w\sin 2\theta, \qquad t^{-1}wt = v\sin 2\theta + w\cos 2\theta,$$

because this means that conjugation by t rotates the basis vectors v and w,
and hence the whole plane orthogonal to the line $\mathbb{R}u$, through angle $2\theta$.
This is confirmed by the following computation:

$$\begin{aligned}
t^{-1}vt &= (\cos\theta - u\sin\theta)\,v\,(\cos\theta + u\sin\theta) \\
&= (v\cos\theta - uv\sin\theta)(\cos\theta + u\sin\theta) \\
&= v\cos^2\theta - uv\sin\theta\cos\theta + vu\sin\theta\cos\theta - uvu\sin^2\theta \\
&= v\cos^2\theta - 2uv\sin\theta\cos\theta + u^2 v\sin^2\theta && \text{because } vu = -uv \\
&= v(\cos^2\theta - \sin^2\theta) - 2w\sin\theta\cos\theta && \text{because } u^2 = -1,\ uv = w \\
&= v\cos 2\theta - w\sin 2\theta.
\end{aligned}$$

A similar computation (try it) shows that $t^{-1}wt = v\sin 2\theta + w\cos 2\theta$, as
required. □
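The theorem can be tested numerically on a concrete axis. The sketch below is an illustration under stated assumptions, not part of the original proof: quaternions are 4-tuples `(a, b, c, d)`, the axis is taken to be $u = k$ with $v = i$ (so $w = uv = j$), and the names `qmul`, `conjugate_by`, and `close` are hypothetical helpers.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def conjugate_by(t, q):
    """Return t^{-1} q t for a unit quaternion t (its inverse is its conjugate)."""
    t_inv = (t[0], -t[1], -t[2], -t[3])
    return qmul(qmul(t_inv, q), t)

def close(p, q, tol=1e-12):
    """Componentwise comparison up to floating-point rounding."""
    return all(abs(x - y) <= tol for x, y in zip(p, q))

theta = math.pi / 6
u = (0.0, 0.0, 0.0, 1.0)   # axis u = k
v = (0.0, 1.0, 0.0, 0.0)   # v = i, orthogonal to u
w = qmul(u, v)             # w = uv = ki = j
t = (math.cos(theta), 0.0, 0.0, math.sin(theta))  # t = cos(theta) + u sin(theta)

c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
image_u = conjugate_by(t, u)
image_v = conjugate_by(t, v)
image_w = conjugate_by(t, w)
expected_v = tuple(c2 * x - s2 * y for x, y in zip(v, w))  # v cos 2θ - w sin 2θ
expected_w = tuple(s2 * x + c2 * y for x, y in zip(v, w))  # v sin 2θ + w cos 2θ
```

The axis stays fixed and the plane spanned by v and w turns through $2\theta$, which is why a rotation through angle $\alpha$ needs the half-angle $\alpha/2$ in the quaternion.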

This theorem shows that every rotation of $\mathbb{R}^3$, given by an axis u and
angle of rotation $\alpha$, is the result of conjugation by the unit quaternion

$$t = \cos\frac{\alpha}{2} + u\sin\frac{\alpha}{2}.$$

The same rotation is induced by $-t$, since $(-t)^{-1}s(-t) = t^{-1}st$. But $\pm t$
are the only unit quaternions that induce this rotation, because each unit
quaternion is uniquely expressible in the form $t = \cos\frac{\alpha}{2} + u\sin\frac{\alpha}{2}$, and the
rotation is uniquely determined by the two (axis, angle) pairs $(u, \alpha)$ and
$(-u, -\alpha)$. The quaternions $t$ and $-t$ are said to be antipodal, because they
represent diametrically opposite points on the 3-sphere of unit quaternions.

Thus the theorem says that rotations of $\mathbb{R}^3$ correspond to antipodal
pairs of unit quaternions. We also have the following important corollary.
Rotations form a group. The product of rotations is a rotation, and the
inverse of a rotation is a rotation. 

Proof. The inverse of the rotation about axis u through angle $\alpha$ is obviously
a rotation, namely, the rotation about axis u through angle $-\alpha$.

It is not obvious what the product of two rotations is, but we can show
as follows that it has an axis and angle of rotation, and hence is a rotation.
Suppose we are given a rotation $r_1$ with axis $u_1$ and angle $\alpha_1$, and a rotation
$r_2$ with axis $u_2$ and angle $\alpha_2$. Then

$r_1$ is induced by conjugation by $t_1 = \cos\frac{\alpha_1}{2} + u_1\sin\frac{\alpha_1}{2}$

and

$r_2$ is induced by conjugation by $t_2 = \cos\frac{\alpha_2}{2} + u_2\sin\frac{\alpha_2}{2}$,

hence the result $r_1 r_2$ of doing $r_1$, then $r_2$, is induced by

$$q \mapsto t_2^{-1}(t_1^{-1}qt_1)t_2 = (t_1t_2)^{-1}q(t_1t_2),$$

which is conjugation by $t_1t_2 = t$. The quaternion t is also a unit quaternion,
so

$$t = \cos\frac{\alpha}{2} + u\sin\frac{\alpha}{2}$$

for some unit imaginary quaternion u and angle $\alpha$. Thus the product rotation
is the rotation about axis u through angle $\alpha$. □
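The proof suggests a recipe: multiply the two unit quaternions, then read the axis and angle off the product. The following sketch follows that recipe for one concrete pair of rotations. It is an illustration only; the function names `unit_quaternion` and `axis_angle` are assumptions of this example, and `axis_angle` assumes the product is not $\pm 1$ (otherwise the axis is undefined).

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def conjugate_by(t, q):
    """Return t^{-1} q t for a unit quaternion t."""
    t_inv = (t[0], -t[1], -t[2], -t[3])
    return qmul(qmul(t_inv, q), t)

def unit_quaternion(axis, alpha):
    """t = cos(alpha/2) + u sin(alpha/2) for a unit axis u and angle alpha."""
    s = math.sin(alpha / 2)
    return (math.cos(alpha / 2), s * axis[0], s * axis[1], s * axis[2])

def axis_angle(t):
    """Recover (axis, angle) from a unit quaternion with nonzero imaginary part."""
    a = max(-1.0, min(1.0, t[0]))   # guard against rounding just outside [-1, 1]
    alpha = 2 * math.acos(a)
    s = math.sqrt(1 - a * a)        # equals sin(alpha/2), assumed nonzero here
    return (t[1] / s, t[2] / s, t[3] / s), alpha

def close(p, q, tol=1e-9):
    return all(abs(x - y) <= tol for x, y in zip(p, q))

# r1: quarter turn about the i-axis; r2: quarter turn about the j-axis.
t1 = unit_quaternion((1.0, 0.0, 0.0), math.pi / 2)
t2 = unit_quaternion((0.0, 1.0, 0.0), math.pi / 2)
t = qmul(t1, t2)

q = (0.0, 0.3, -1.2, 2.0)   # an arbitrary pure imaginary quaternion
one_after_other = conjugate_by(t2, conjugate_by(t1, q))   # r1, then r2
in_one_step = conjugate_by(t, q)                          # conjugation by t1 t2

axis, alpha = axis_angle(t)
```

In this example the product is $t = \tfrac12(1 + i + j + k)$, so the two quarter turns compose to a rotation through $2\pi/3$ about the diagonal axis $(i + j + k)/\sqrt{3}$.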


The proof shows that the axis and angle of the product rotation $r_1r_2$ can
in principle be found from those of $r_1$ and $r_2$ by quaternion multiplication.
They may also be described geometrically, by the alternative proof of the
group property given in the exercises below.

Exercises 

The following exercises introduce a small fragment of the geometry of isometries:
that any rotation of the plane or space is a product of two reflections. We begin
with the simplest case: representing rotation of the plane about O through angle
$\theta$ as the product of reflections in two lines through O.

If $\mathcal{L}$ is any line in the plane, then reflection in $\mathcal{L}$ is the transformation of the
plane that sends each point S to the point S′ such that SS′ is orthogonal to $\mathcal{L}$ and
$\mathcal{L}$ is equidistant from S and S′.


Figure 1.3: Reflection of S and its angle.


1.5.1 If $\mathcal{L}$ passes through P, and if S lies on one side of $\mathcal{L}$ at angle $\alpha$ (Figure 1.3),
show that S′ lies on the other side of $\mathcal{L}$ at angle $\alpha$, and that $|PS| = |PS'|$.

1.5.2 Deduce, from Exercise 1.5.1 or otherwise, that the rotation about P through
angle $\theta$ is the result of reflections in any two lines through P that meet at
angle $\theta/2$.

1.5.3 Deduce, from Exercise 1.5.2 or otherwise, that if $\mathcal{L}$, $\mathcal{M}$, and $\mathcal{N}$ are lines
situated as shown in Figure 1.4, then the result of rotation about P through
angle $\theta$, followed by rotation about Q through angle $\varphi$, is rotation about R
through angle $\chi$ (with rotations in the senses indicated by the arrows).

[Figure 1.4]

1.5.4 If $\mathcal{L}$ and $\mathcal{N}$ are parallel, so R does not exist, what isometry is the result of
the rotations about P and Q?

Now we extend these ideas to $\mathbb{R}^3$. A rotation about a line through O (called
the axis of rotation) is the product of reflections in planes through O that meet 
along the axis. To make the reflections easier to visualize, we do not draw the 
planes, but only their intersections with the unit sphere (see Figure 1.5). 

These intersections are curves called great circles, and reflection in a great 
circle is the restriction to the sphere of reflection in a plane through O. 
Figure 1.5: Reflections in great circles on the sphere.


1.5.5 Adapt the argument of Exercise 1.5.3 to great circles $\mathcal{L}$, $\mathcal{M}$, and $\mathcal{N}$ shown
in Figure 1.5. What is the conclusion?

1.5.6 Explain why there is no exceptional case analogous to Exercise 1.5.4. Deduce
that the product of any two rotations of $\mathbb{R}^3$ about O is another rotation
about O, and explain how to find the axis of the product rotation.

The idea of representing isometries as products of reflections is also useful in
higher dimensions. We use this idea again in Section 2.4, where we show that any
isometry of $\mathbb{R}^n$ that fixes O is the product of at most n reflections in hyperplanes
through O.


1.6 Discussion 

The geometric properties of complex numbers were discovered long before
the complex numbers themselves. Diophantus (already mentioned in Section
1.2) was aware of the two-square identity, and indeed he associated a
sum of two squares, $a^2 + b^2$, with the right-angled triangle with perpendicular
sides a and b. Thus, Diophantus was vaguely aware of two-dimensional
objects (right-angled triangles) with a multiplicative property (of their
hypotenuses). Around 1590, Viète noticed that the Diophantus “product”
of triangles with sides $(a, b)$ and $(c, d)$, namely the triangle with sides
$(ac - bd, bc + ad)$, also has an additive property, of angles (Figure 1.6).
Figure 1.6: The Diophantus “product” of triangles.


The algebra of complex numbers emerged from the study of polynomial
equations in the sixteenth century, particularly the solution of cubic
equations by the Italian mathematicians del Ferro, Tartaglia, Cardano,
and Bombelli. Complex numbers were not required for the solution of
quadratic equations, because in the sixteenth century one could say that
$x^2 + 1 = 0$, for example, has no solution. The formal solution $x = \sqrt{-1}$
was just a signal that no solution really exists. Cubic equations force the
issue because the equation $x^3 = px + q$ has solution

$$x = \sqrt[3]{\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 - \left(\frac{p}{3}\right)^3}} + \sqrt[3]{\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 - \left(\frac{p}{3}\right)^3}}$$

(the “Cardano formula”). Thus, according to the Cardano formula the
solution of $x^3 = 15x + 4$ is

$$x = \sqrt[3]{2 + \sqrt{2^2 - 5^3}} + \sqrt[3]{2 - \sqrt{2^2 - 5^3}} = \sqrt[3]{2 + 11i} + \sqrt[3]{2 - 11i}.$$

But the symbol $i = \sqrt{-1}$ cannot be signaling NO SOLUTION here, because
there is an obvious solution $x = 4$. How can $\sqrt[3]{2 + 11i} + \sqrt[3]{2 - 11i}$ be the
solution when 4 is?

In 1572, Bombelli resolved this conflict, and launched the algebra of
complex numbers, by observing that

$$(2 + i)^3 = 2 + 11i, \qquad (2 - i)^3 = 2 - 11i,$$

and therefore

$$\sqrt[3]{2 + 11i} + \sqrt[3]{2 - 11i} = (2 + i) + (2 - i) = 4,$$


assuming that i obeys the same rules as ordinary, real, numbers. His calcu- 
lation was, in effect, an experimental test of the proposition that complex 
numbers form a field — a proposition that could not have been formulated, 
let alone proved, at the time. The first rigorous treatment of complex num- 
bers was that of Hamilton, who in 1835 gave definitions of complex num- 
bers, addition, and multiplication that make a proof of the field properties 
crystal clear. 
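Bombelli's arithmetic is easy to replay with Python's built-in complex numbers. This is only a modern check of the numbers quoted above, under the assumption that $i$ is modeled by Python's `1j`; the helper name `cube` is an illustrative choice.

```python
def cube(z):
    """z^3 by repeated multiplication, exact for these small integer-valued inputs."""
    return z * z * z

plus = cube(2 + 1j)    # (2 + i)^3
minus = cube(2 - 1j)   # (2 - i)^3

# The quantity under the square root in the Cardano formula for x^3 = 15x + 4.
p, q = 15, 4
discriminant = (q / 2) ** 2 - (p / 3) ** 3

# Adding the two cube roots 2 + i and 2 - i gives a real number.
x = (2 + 1j) + (2 - 1j)

# That real number solves the cubic.
solves_cubic = x.imag == 0 and x.real ** 3 == p * x.real + q
```

The negative discriminant is what forces $\sqrt{-121} = 11i$ into the formula even though the final answer is the real number 4.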

Hamilton defined complex numbers as ordered pairs $z = (a, b)$ of real
numbers, and he defined their sum and product by

$$(a_1, b_1) + (a_2, b_2) = (a_1 + a_2, b_1 + b_2),$$

$$(a_1, b_1)(a_2, b_2) = (a_1a_2 - b_1b_2, a_1b_2 + b_1a_2).$$

Of course, these definitions are motivated by the interpretation of $(a, b)$ as
$a + ib$, where $i^2 = -1$, but the important point is that the field properties
follow from these definitions and the properties of real numbers. The
properties of addition are directly “inherited” from properties of real number
addition. For example, for complex numbers $z_1 = (a_1, b_1)$ and $z_2 = (a_2, b_2)$
we have

$$z_1 + z_2 = z_2 + z_1$$

because

$$a_1 + a_2 = a_2 + a_1 \quad\text{and}\quad b_1 + b_2 = b_2 + b_1 \quad\text{for real numbers } a_1, a_2, b_1, b_2.$$

Indeed, the properties of addition are not special properties of pairs; they
also hold for the vector sum of triples, quadruples, and so on. The field
properties of multiplication, on the other hand, depend on the curious
definition of product of pairs, which has no obvious generalization to a product
of n-tuples for $n > 2$.
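Hamilton's definitions translate directly into code, which makes it possible to spot-check the field properties on a finite sample. The sketch below is an illustration, not a proof: the names `add` and `mul` and the choice of sample pairs with integer entries from $-2$ to $2$ are assumptions of this example.

```python
from itertools import product

def add(z, w):
    """Sum of pairs: (a1, b1) + (a2, b2) = (a1 + a2, b1 + b2)."""
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    """Product of pairs: (a1, b1)(a2, b2) = (a1 a2 - b1 b2, a1 b2 + b1 a2)."""
    (a1, b1), (a2, b2) = z, w
    return (a1 * a2 - b1 * b2, a1 * b2 + b1 * a2)

# A small grid of sample pairs with integer entries.
samples = list(product(range(-2, 3), repeat=2))

commutative = all(mul(z, w) == mul(w, z) for z in samples for w in samples)
associative = all(
    mul(mul(x, y), z) == mul(x, mul(y, z))
    for x in samples for y in samples for z in samples
)
distributive = all(
    mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
    for x in samples for y in samples for z in samples
)

# The pair (0, 1) plays the role of i: its square is (-1, 0).
i_squared = mul((0, 1), (0, 1))
```

A finite check like this cannot replace the algebraic proof, but it shows how each property reduces to ordinary arithmetic on the real entries.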

This raises the question: is it possible to define a “product” operation on
$\mathbb{R}^n$ that, together with the vector sum operation, makes $\mathbb{R}^n$ a field? Hamilton
hoped to find such a product for each n. Indeed, he hoped to find a
product with not only the field properties but also the multiplicative
absolute value

$$|uv| = |u||v|,$$

where the absolute value of $u = (x_1, x_2, \ldots, x_n)$ is $|u| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$.

As we have seen, for $n = 2$ this property is equivalent to the Diophantus
identity for sums of two squares, so a multiplicative absolute value in
general implies an identity for sums of n squares.

Hamilton attacked the problem from the opposite direction, as it were.
He tried to define the product operation, first for triples, before worrying
about the absolute value. But after searching fruitlessly for 13 years, he
had to admit defeat. He still had not noticed that there is no three-square
identity, but he suspected that multiplying triples of the form $a + bi + cj$
requires a new object $k = ij$. Also, he began to realize that there is no hope
for the commutative law of multiplication. Desperate to salvage something
from his 13 years of work, he made the leap to the fourth dimension. He
took $k = ij$ to be a vector perpendicular to 1, i, and j, and sacrificed the
commutative law by allowing $ij = -ji$, $jk = -kj$, and $ki = -ik$. On October
16, 1843 he had his famous epiphany that i, j, and k must satisfy

$$i^2 = j^2 = k^2 = ijk = -1.$$

As we have seen in Section 1.3, these relations imply all the field prop- 
erties, except commutative multiplication. Such a system is often called 
a skew field (though this term unfortunately suggests a specialization of 
the field concept, rather than what it really is — a generalization). Hamil- 
ton’s relations also imply that absolute value is multiplicative — a fact he 
had to check, though the equivalent four-square identity was well known 
to number theorists. 
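Hamilton's relations, the failure of commutativity, and the multiplicative absolute value can all be confirmed on examples. The following is a small sketch assuming quaternions are stored as integer 4-tuples `(a, b, c, d)`; the names `qmul` and `norm_sq` are illustrative, and `norm_sq` computes the square of the absolute value to stay within exact integer arithmetic.

```python
def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def norm_sq(q):
    """Square of the absolute value: a^2 + b^2 + c^2 + d^2."""
    return sum(x * x for x in q)

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)
minus_one = (-1, 0, 0, 0)

# i^2, j^2, k^2 and ijk, which should all equal -1.
relations = [qmul(i, i), qmul(j, j), qmul(k, k), qmul(qmul(i, j), k)]

# Commutativity fails: ij = k but ji = -k.
ij, ji = qmul(i, j), qmul(j, i)

# One instance of the four-square identity |pq|^2 = |p|^2 |q|^2.
p = (1, 2, 3, 4)
q = (5, -6, 7, 8)
pq = qmul(p, q)
```

With these inputs the identity reads $36^2 + 0^2 + 18^2 + 60^2 = 30 \cdot 174$.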

In 1878, Frobenius proved that the quaternion algebra $\mathbb{H}$ is the only
skew field $\mathbb{R}^n$ that is not a field, so Hamilton had found the only “algebra
of n-tuples” it was possible to find under the conditions he had imposed.

The multiplicative absolute value, as stressed in Section 1.4, implies
that multiplication by a quaternion of absolute value 1 is an isometry of
$\mathbb{R}^4$. Hamilton seems to have overlooked this important geometric fact, and
the quaternion representation of space rotations (Section 1.5) was first
published by Cayley in 1845. Cayley also noticed that the corresponding
formulas for transforming the coordinates of $\mathbb{R}^3$ had been given by Rodrigues
in 1840. Cayley’s discovery showed that the noncommutative quaternion
product is a good thing, because space rotations are certainly noncommutative;
hence they can be faithfully represented only by a noncommutative
algebra. This finding has been enthusiastically endorsed by the computer
graphics profession today, which uses quaternions as a standard tool for
rendering 3-dimensional motion.

The quaternion algebra $\mathbb{H}$ plays two roles in Lie theory. On the one
hand, $\mathbb{H}$ gives the most understandable treatment of rotations in $\mathbb{R}^3$ and $\mathbb{R}^4$,
and hence of the rotation groups of these two spaces. The rotation groups
of $\mathbb{R}^3$ and $\mathbb{R}^4$ are Lie groups, and they illustrate many general features of
Lie theory in a way that is easy to visualize and compute. On the other
hand, $\mathbb{H}$ also provides coordinates for an infinite series of spaces $\mathbb{H}^n$, with
properties closely analogous to those of the spaces $\mathbb{R}^n$ and $\mathbb{C}^n$. In particular,
we can generalize the concept of “rotation group” from $\mathbb{R}^n$ to both $\mathbb{C}^n$ and
$\mathbb{H}^n$ (see Chapter 3). It turns out that almost all Lie groups and Lie algebras
can be associated with the spaces $\mathbb{R}^n$, $\mathbb{C}^n$, or $\mathbb{H}^n$, and these are the spaces
we are concerned with in this book.

However, we cannot fail to mention what falls outside our scope: the
8-dimensional algebra $\mathbb{O}$ of octonions. Octonions were discovered by a
friend of Hamilton, John Graves, in December 1843. Graves noticed that
the algebra of quaternions could be derived from Euler’s four-square
identity, and he realized that an eight-square identity would similarly yield a
“product” of octuples with multiplicative absolute value. An eight-square
identity had in fact been published by the Danish mathematician Degen in
1818, but Graves did not know this. Instead, Graves discovered the
eight-square identity himself, and with it the algebra of octonions. The octonion
sum, as usual, is the vector sum, and the octonion product is not only
noncommutative but also nonassociative. That is, it is not generally the case
that $u(vw) = (uv)w$.

The nonassociative octonion product causes trouble both algebraically
and geometrically. On the algebraic side, one cannot represent octonions
by matrices, because the matrix product is associative. On the geometric
side, an octonion projective space (of more than two dimensions) is
impossible, because of a theorem of Hilbert from 1899. Hilbert’s theorem
essentially states that the coordinates of a projective space satisfy the
associative law of multiplication (see Hilbert [1971]). One therefore has only
$\mathbb{O}$ itself, and the octonion projective plane, $\mathbb{O}P^2$, to work with. Because of
this, there are few important Lie groups associated with the octonions. But
these are a very select few! They are called the exceptional Lie groups, and
they are among the most interesting objects in mathematics. Unfortunately,
they are beyond the scope of this book, so we can mention them only in
passing.
Preview

This chapter begins by reviewing some basic group theory — subgroups, 
quotients, homomorphisms, and isomorphisms — in order to have a basis 
for discussing Lie groups in general and simple Lie groups in particular. 

We revisit the group $\mathbb{S}^3$ of unit quaternions, this time viewing its
relation to the group SO(3) as a 2-to-1 homomorphism. It follows that $\mathbb{S}^3$ is
not a simple group. On the other hand, SO(3) is simple, as we show by a
direct geometric proof.

This discovery motivates much of Lie theory. There are infinitely many
simple Lie groups, and most of them are generalizations of rotation groups
in some sense. However, deep ideas are involved in identifying the simple
groups and in showing that we have enumerated them all.

To show why it is not easy to identify all the simple Lie groups we
make a special study of SO(4), the rotation group of $\mathbb{R}^4$. Like SO(3),
SO(4) can be described with the help of quaternions. But a rotation of
$\mathbb{R}^4$ generally depends on two quaternions, and this gives SO(4) a special
structure, related to the direct product of $\mathbb{S}^3$ with itself. In particular, it
follows that SO(4) is not simple.


J. Stillwell, Naive Lie Theory , DOI: 10.1007/978-0-387-78214-0_2, 
© Springer Science+Business Media, LLC 2008 
2.1 Crash course on groups

For readers who would like a reminder of the basic properties of groups, 
here is a crash course, oriented toward the kind of groups studied in this 
book. Even those who have not seen groups before will be familiar with the 
computational tricks — such as canceling by multiplying by the inverse — 
since they are the same as those used in matrix computations. 

First, a group G is a set with “product” and “inverse” operations, and
an identity element 1, with the following three basic properties:

$$g_1(g_2g_3) = (g_1g_2)g_3 \quad\text{for all } g_1, g_2, g_3 \in G,$$

$$g1 = 1g = g \quad\text{for all } g \in G,$$

$$gg^{-1} = g^{-1}g = 1 \quad\text{for all } g \in G.$$

It should be mentioned that 1 is the unique element $g'$ such that $gg' = g$
for all $g \in G$, because multiplying the equation $gg' = g$ on the left by $g^{-1}$
gives $g' = 1$. Similarly, for each $g \in G$, $g^{-1}$ is the unique element $g''$ such
that $gg'' = 1$.

The above notation for “product,” “inverse,” and “identity” is called
multiplicative notation. It is used (sometimes with $I$, $e$, or $\mathbf{1}$ in place of 1)
for groups of numbers, quaternions, matrices, and all other groups whose
operation is called “product.” There are a few groups whose operation is
called “sum,” such as $\mathbb{R}^n$ under vector addition. For these we use additive
notation: $g_1 + g_2$ for the “sum” of $g_1, g_2 \in G$, $-g$ for the inverse of $g \in G$,
and 0 (or $\mathbf{0}$) for the identity of G. Additive notation is used only when G is
abelian, that is, when $g_1 + g_2 = g_2 + g_1$ for all $g_1, g_2 \in G$.

Since groups are generally not abelian, we have to speak of multiplying
h by g “on the left” or “on the right,” because gh and hg are generally
different. If we multiply all members $g'$ of a group G on the left by a
particular $g \in G$, we get back all the members of G, because for any $g'' \in G$
there is a $g' \in G$ such that $gg' = g''$ (namely $g' = g^{-1}g''$).
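The axioms, and the remark that left multiplication by a fixed element gives back the whole group, can be checked exhaustively on a small nonabelian group. The sketch below uses the six permutations of three symbols as its example; this choice of group and the helper names `mul` and `inv` are assumptions of the illustration, not something taken from the text.

```python
from itertools import permutations

# The group of all permutations of {0, 1, 2}; g is stored as the tuple (g(0), g(1), g(2)).
G = list(permutations(range(3)))
identity = (0, 1, 2)

def mul(g, h):
    """The product gh, meaning: apply h first, then g."""
    return tuple(g[h[x]] for x in range(3))

def inv(g):
    """The inverse permutation, which undoes g."""
    return tuple(sorted(range(3), key=lambda x: g[x]))

associative = all(
    mul(a, mul(b, c)) == mul(mul(a, b), c) for a in G for b in G for c in G
)
has_identity = all(mul(g, identity) == g == mul(identity, g) for g in G)
has_inverses = all(mul(g, inv(g)) == identity == mul(inv(g), g) for g in G)

# Multiplying every member on the left by a fixed g gives back all of G.
left_translation_is_bijection = all(
    sorted(mul(g, h) for h in G) == sorted(G) for g in G
)

# The group is not abelian: gh and hg can differ.
abelian = all(mul(g, h) == mul(h, g) for g in G for h in G)
```

Because the group has only six elements, every property can be verified by brute force over all pairs or triples.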

Subgroups and cosets 

To study a group G we look at the groups H contained in it, the subgroups
of G. For each subgroup H of G we have a decomposition of G into disjoint
pieces called the (left or right) cosets of H in G. The left cosets (which we
stick with, for the sake of consistency) are the sets of the form

$$gH = \{gh : h \in H\}.$$
Thus H itself is the coset for $g = 1$, and in general a coset gH is “H translated
by g,” though one cannot usually take the word “translation” literally.
One example for which this is literally true is G the plane $\mathbb{R}^2$ of points
$(x, y)$ under vector addition, and H the subgroup of points $(0, y)$. In this
case we use additive notation and write the coset of $(x, y)$ as

$$(x, y) + H = \{(x, y) : y \in \mathbb{R}\}, \quad\text{where } x \text{ is constant.}$$

Then H is the y-axis and the coset $(x, y) + H$ is H translated by the vector
$(x, y)$ (see Figure 2.1). This example also illustrates how a group G decomposes
into disjoint cosets (decomposing the plane into parallel lines), and
that different $g \in G$ can give the same coset gH. For example, $(1, 0) + H$
and $(1, 1) + H$ are both the vertical line $x = 1$.

Figure 2.1: Subgroup H of $\mathbb{R}^2$ and cosets.

Each coset gH is in 1-to-1 correspondence with H because we get back
each $h \in H$ from $gh \in gH$ by multiplying on the left by $g^{-1}$. Different
cosets are disjoint because if $g \in g_1H$ and $g \in g_2H$ then

$$g = g_1h_1 = g_2h_2 \quad\text{for some } h_1, h_2 \in H,$$

and therefore $g_1 = g_2h_2h_1^{-1}$. But then

$$g_1H = g_2h_2h_1^{-1}H = g_2(h_2h_1^{-1}H) = g_2H$$

because $h_2h_1^{-1} \in H$ and therefore $h_2h_1^{-1}H = H$ by the remark at the end of
the last subsection (that multiplying a group by one of its members gives
back the group). Thus if two cosets have an element in common, they are
identical.
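The partition into cosets can be seen concretely in a small example. The sketch below is an illustration under assumptions of its own: the group is the six permutations of three symbols, H is the two-element subgroup generated by the swap of 0 and 1, and `mul` and `left_coset` are hypothetical helper names.

```python
from itertools import permutations

# The group of all permutations of {0, 1, 2}, with g stored as (g(0), g(1), g(2)).
G = list(permutations(range(3)))

def mul(g, h):
    """The product gh: apply h first, then g."""
    return tuple(g[h[x]] for x in range(3))

# A two-element subgroup: the identity and the swap of 0 and 1.
H = [(0, 1, 2), (1, 0, 2)]

def left_coset(g):
    """The left coset gH = {gh : h in H}, as a frozenset."""
    return frozenset(mul(g, h) for h in H)

# Collecting gH over all g yields each distinct coset once.
cosets = {left_coset(g) for g in G}

# Different representatives of the same coset: g and gh with h in H.
g = (1, 2, 0)
same_coset = left_coset(g) == left_coset(mul(g, (1, 0, 2)))

# Any two cosets are either identical or disjoint.
disjoint_or_equal = all(c == d or not (c & d) for c in cosets for d in cosets)
```

Six elements split into three cosets of two elements each, matching the 1-to-1 correspondence between each coset and H.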
This algebraic argument has surprising geometric consequences; for
example, a filling of $\mathbb{S}^3$ by disjoint circles known as the Hopf fibration.
Figure 2.2 shows some of the circles, projected stereographically into $\mathbb{R}^3$.
The circles fill nested torus surfaces, one of which is shown in gray.

Figure 2.2: Some circles in the Hopf fibration.

Proposition: $\mathbb{S}^3$ can be decomposed into disjoint congruent circles.

Proof. As we saw in Section 1.3, the quaternions $a + bi + cj + dk$ of unit
length satisfy

$$a^2 + b^2 + c^2 + d^2 = 1,$$

and hence they form a 3-sphere $\mathbb{S}^3$. The unit quaternions also form a group
G, because the product and inverse of unit quaternions are also unit
quaternions, by the multiplicative property of absolute value.

One subgroup H of G consists of the unit quaternions of the form
$\cos\theta + i\sin\theta$, and these form a unit circle in the plane spanned by 1 and
i. It follows that any coset qH is also a unit circle, because multiplication
by a quaternion q of unit length is an isometry, as we saw in Section
1.4. Since the cosets qH fill the whole group and are disjoint, we have a
decomposition of the 3-sphere into unit circles. □
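The claim that each coset qH is a unit circle can be probed numerically for one particular q. The sketch below is an illustration under its own assumptions: q is taken to be $\tfrac12(1 + i + j + k)$, twelve sample angles are used, and the names `qmul`, `coset_point`, `norm`, and `dot` are hypothetical helpers. It checks that the sample points of qH have length 1 and lie on the circle $q\cos\theta + (qi)\sin\theta$ spanned by the orthonormal pair q and qi.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def norm(p):
    """Absolute value of a quaternion."""
    return math.sqrt(sum(x * x for x in p))

def dot(p, r):
    """Inner product in R^4."""
    return sum(x * y for x, y in zip(p, r))

q = (0.5, 0.5, 0.5, 0.5)              # a unit quaternion
qi = qmul(q, (0.0, 1.0, 0.0, 0.0))    # q times i

def coset_point(theta):
    """The element q(cos(theta) + i sin(theta)) of the coset qH."""
    return qmul(q, (math.cos(theta), math.sin(theta), 0.0, 0.0))

angles = [2 * math.pi * n / 12 for n in range(12)]
points = [coset_point(a) for a in angles]

all_unit_length = all(abs(norm(p) - 1.0) < 1e-12 for p in points)
q_and_qi_orthogonal = abs(dot(q, qi)) < 1e-12
on_the_circle = all(
    all(
        abs(x - (math.cos(a) * y + math.sin(a) * z)) < 1e-12
        for x, y, z in zip(p, q, qi)
    )
    for a, p in zip(angles, points)
)
```

Since q and qi are orthogonal unit vectors of $\mathbb{R}^4$, the points $q\cos\theta + (qi)\sin\theta$ trace a circle of radius 1 in the plane they span.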


Exercises

An important nonabelian group (in fact, it is the simplest example of a nonabelian
Lie group) is the group of functions of the form

$$f_{a,b}(x) = ax + b, \quad\text{where } a, b \in \mathbb{R} \text{ and } a > 0.$$

The group operation is function composition.

2.1.1 If $f_{a,b}(x) = f_{a_2,b_2}(f_{a_1,b_1}(x))$, work out a, b in terms of $a_1, b_1, a_2, b_2$, and
check that they are the same as the a, b determined by

$$\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a_2 & b_2 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a_1 & b_1 \\ 0 & 1 \end{pmatrix}.$$

2.1.2 Also show that the inverse function $f_{a,b}^{-1}(x)$ exists, and that it corresponds to
the inverse matrix

$$\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}^{-1}.$$

This correspondence between functions and matrices is a matrix representation of
the group of functions $f_{a,b}$. We have already seen examples of matrix representations
of groups, such as the rotation groups in two and three dimensions, and,
in fact, most of the important Lie groups can be represented by matrices.

The unit complex numbers, $\cos\theta + i\sin\theta$, form a group SO(2) that we began
to study in Section 1.1. We now investigate its subgroups.

2.1.3 Other than the trivial group {1}, what is the smallest subgroup of SO(2)?

2.1.4 Show that there is exactly one n-element subgroup of SO(2), for each natural
number n, and list its members.

2.1.5 Show that the union R of all the finite subgroups of SO(2) is also a subgroup
(the group of “rational rotations”).

2.1.6 If z is a complex number not in the group R described in Exercise 2.1.5,
show that the numbers $\ldots, z^{-2}, z^{-1}, 1, z, z^2, \ldots$ are all distinct, and that they
form a subgroup of SO(2).


2.2 Crash course on homomorphisms 

Normal subgroups 

Since $hg \neq gh$ in general, it can also be that $gH \neq Hg$, where

$$Hg = \{hg : h \in H\}$$

is the right coset of H. If $gH = Hg$ for all $g \in G$, we say that H is a normal
subgroup of G. An equivalent statement is that H equals

$$g^{-1}Hg = \{g^{-1}hg : h \in H\} \quad\text{for each } g \in G.$$

(Because of this, it would be more sensible to call H “self-conjugate,” but
unfortunately the overused word “normal” has stuck.)

The good thing about a normal subgroup H is that its cosets themselves
form a group when “multiplied” by the rule that “the coset of $g_1$, times the
coset of $g_2$, equals the coset of $g_1g_2$”:

$$g_1H \cdot g_2H = g_1g_2H.$$


This rule makes sense for a normal subgroup H because if $g_1'H = g_1H$ and
$g_2'H = g_2H$ then $g_1'g_2'H = g_1g_2H$ as follows:

$$\begin{aligned}
g_1'g_2'H &= g_1'Hg_2' && \text{since } g_2'H = Hg_2' \text{ by normality,} \\
&= g_1Hg_2' && \text{since } g_1'H = g_1H \text{ by assumption,} \\
&= g_1g_2'H && \text{since } g_2'H = Hg_2' \text{ by normality,} \\
&= g_1g_2H && \text{since } g_2'H = g_2H \text{ by assumption.}
\end{aligned}$$


The group of cosets is called the quotient group of G by H, and is
written G/H. (When G and H are finite, the size of G/H is indeed the size
of G divided by the size of H.) We reiterate that the quotient group G/H
exists only when H is a normal subgroup. Another, more efficient, way to
describe this situation is in terms of homomorphisms: structure-preserving
maps from one group to another.
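The role of normality in making the coset product well defined can be seen by comparing a normal and a non-normal subgroup of the same group. The sketch below is illustrative and rests on its own assumptions: the group is the six permutations of three symbols, `A3` is its subgroup of even permutations, `H2` is the two-element subgroup generated by one swap, and all function names are hypothetical.

```python
from itertools import permutations

G = list(permutations(range(3)))   # permutations g stored as (g(0), g(1), g(2))

def mul(g, h):
    """The product gh: apply h first, then g."""
    return tuple(g[h[x]] for x in range(3))

def left_coset(g, H):
    return frozenset(mul(g, h) for h in H)

def right_coset(H, g):
    return frozenset(mul(h, g) for h in H)

def is_normal(H):
    """H is normal when gH = Hg for every g in G."""
    return all(left_coset(g, H) == right_coset(H, g) for g in G)

def is_even(g):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(1 for x in range(3) for y in range(x + 1, 3) if g[x] > g[y])
    return inversions % 2 == 0

A3 = [g for g in G if is_even(g)]   # the even permutations
H2 = [(0, 1, 2), (1, 0, 2)]         # identity and one swap

def coset_product_is_well_defined(H):
    """Check that the coset of xy depends only on the cosets of x and y."""
    cosets = {left_coset(g, H) for g in G}
    return all(
        len({left_coset(mul(x, y), H) for x in X for y in Y}) == 1
        for X in cosets
        for Y in cosets
    )

quotient = {left_coset(g, A3) for g in G}   # the two cosets forming G/A3
```

For the normal subgroup the rule "coset of $g_1$ times coset of $g_2$ equals coset of $g_1g_2$" gives the same answer for every choice of representatives; for the non-normal one it does not.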


Homomorphisms and isomorphisms 

When H is a normal subgroup of G, the map $\varphi : G \to G/H$ defined by

$$\varphi(g) = gH \quad\text{for all } g \in G$$

preserves products in the sense that

$$\varphi(g_1g_2) = \varphi(g_1) \cdot \varphi(g_2).$$

This follows immediately from the definition of product of cosets, because

$$\varphi(g_1g_2) = g_1g_2H = g_1H \cdot g_2H = \varphi(g_1) \cdot \varphi(g_2).$$
In general, a map $\varphi : G \to G'$ of one group into another is called a
homomorphism (from the Greek for “similar form”) if it preserves products.
A group homomorphism indeed preserves group structure, because it not
only preserves products, but also the identity and inverses. Here is why:

• Since $g = 1g$ for any $g \in G$, we have

$$\varphi(g) = \varphi(1g) = \varphi(1)\varphi(g) \quad\text{because } \varphi \text{ preserves products.}$$

Multiplying both sides on the right by $\varphi(g)^{-1}$ then gives $1 = \varphi(1)$.

• Since $1 = gg^{-1}$ for any $g \in G$, we have

$$1 = \varphi(1) = \varphi(gg^{-1}) = \varphi(g)\varphi(g^{-1})$$

because $\varphi$ preserves products.

This says that $\varphi(g^{-1}) = \varphi(g)^{-1}$, because the inverse of $\varphi(g)$ is
unique.

Thus the image $\varphi(G)$ is of “similar” form to G, but we say that $G'$ is
isomorphic (of the “same form”) to G only when the map $\varphi$ is 1-to-1 and
onto (in which case we call $\varphi$ an isomorphism). In general, $\varphi(G)$ is only a
shadow of G, because many elements of G may map to the same element
of $G'$. The case furthest from isomorphism is that in which $\varphi$ sends all
elements of G to 1.

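Any homomorphism $\varphi$ of G onto $G'$ can be viewed as the special type
$\varphi : G \to G/H$. The appropriate normal subgroup H of G is the so-called
kernel of $\varphi$:

$$H = \ker\varphi = \{g \in G : \varphi(g) = 1\}.$$

Then $G'$ is isomorphic to the group $G/\ker\varphi$ of cosets of $\ker\varphi$ because:

1. $\ker\varphi$ is a group, because

$$\begin{aligned}
h_1, h_2 \in \ker\varphi &\Rightarrow \varphi(h_1) = \varphi(h_2) = 1 \\
&\Rightarrow \varphi(h_1)\varphi(h_2) = 1 \\
&\Rightarrow \varphi(h_1h_2) = 1 \\
&\Rightarrow h_1h_2 \in \ker\varphi
\end{aligned}$$

and

$$\begin{aligned}
h \in \ker\varphi &\Rightarrow \varphi(h) = 1 \\
&\Rightarrow \varphi(h)^{-1} = 1 \\
&\Rightarrow \varphi(h^{-1}) = 1 \\
&\Rightarrow h^{-1} \in \ker\varphi.
\end{aligned}$$

2. $\ker\varphi$ is a normal subgroup of G, because, for any $g \in G$,

$$\begin{aligned}
h \in \ker\varphi &\Rightarrow \varphi(ghg^{-1}) = \varphi(g)\varphi(h)\varphi(g^{-1}) = \varphi(g)1\varphi(g)^{-1} = 1 \\
&\Rightarrow ghg^{-1} \in \ker\varphi.
\end{aligned}$$

Hence $g(\ker\varphi)g^{-1} = \ker\varphi$, that is, $\ker\varphi$ is normal.

3. Each $g' = \varphi(g) \in G'$ corresponds to the coset $g(\ker\varphi)$.

In fact, $g(\ker\varphi) = \varphi^{-1}(g')$, because

$$\begin{aligned}
k \in \varphi^{-1}(g') &\Leftrightarrow \varphi(k) = g' && \text{(definition of } \varphi^{-1}\text{)} \\
&\Leftrightarrow \varphi(k) = \varphi(g) \\
&\Leftrightarrow \varphi(g)^{-1}\varphi(k) = 1 \\
&\Leftrightarrow \varphi(g^{-1}k) = 1 \\
&\Leftrightarrow g^{-1}k \in \ker\varphi \\
&\Leftrightarrow k \in g(\ker\varphi).
\end{aligned}$$

4. Products of elements $g_1', g_2' \in G'$ correspond to products of the
corresponding cosets:

$$g_1' = \varphi(g_1),\ g_2' = \varphi(g_2) \Rightarrow \varphi^{-1}(g_1') = g_1(\ker\varphi),\ \varphi^{-1}(g_2') = g_2(\ker\varphi)$$

by step 3. But also

$$\begin{aligned}
g_1' = \varphi(g_1),\ g_2' = \varphi(g_2) &\Rightarrow g_1'g_2' = \varphi(g_1)\varphi(g_2) = \varphi(g_1g_2) \\
&\Rightarrow \varphi^{-1}(g_1'g_2') = g_1g_2(\ker\varphi),
\end{aligned}$$

also by step 3. Thus the product $g_1'g_2'$ corresponds to $g_1g_2(\ker\varphi)$,
which is the product of the cosets corresponding to $g_1'$ and $g_2'$
respectively.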

To sum up: a group homomorphism $\varphi$ of G onto $G'$ gives a 1-to-1
correspondence between $G'$ and $G/(\ker\varphi)$ that preserves products, that is, $G'$
is isomorphic to $G/(\ker\varphi)$.

This result is called the fundamental homomorphism theorem for 
groups. 
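The four steps of the argument can be traced on a small example. The sketch below is an illustration under its own assumptions: G is the group of permutations of three symbols, the homomorphism is the sign map onto the group {1, -1}, and the names `mul`, `sign`, `coset`, and `correspondence` are hypothetical.

```python
from itertools import permutations

G = list(permutations(range(3)))   # permutations g stored as (g(0), g(1), g(2))

def mul(g, h):
    """The product gh: apply h first, then g."""
    return tuple(g[h[x]] for x in range(3))

def sign(g):
    """The sign of a permutation: +1 if the number of inversions is even, else -1."""
    inversions = sum(1 for x in range(3) for y in range(x + 1, 3) if g[x] > g[y])
    return 1 if inversions % 2 == 0 else -1

# The sign map preserves products, so it is a homomorphism onto {1, -1}.
is_homomorphism = all(sign(mul(g, h)) == sign(g) * sign(h) for g in G for h in G)

# Its kernel: the elements sent to the identity 1.
kernel = frozenset(g for g in G if sign(g) == 1)

def coset(g):
    """The coset g(ker) of the kernel."""
    return frozenset(mul(g, k) for k in kernel)

# Step 3: the preimage of sign(g) is exactly the coset g(ker).
preimages_are_cosets = all(
    frozenset(k for k in G if sign(k) == sign(g)) == coset(g) for g in G
)

# The correspondence between cosets of the kernel and elements of the image.
correspondence = {coset(g): sign(g) for g in G}

# Step 4: the correspondence turns coset products into products in the image.
products_match = all(
    correspondence[coset(mul(g, h))] == correspondence[coset(g)] * correspondence[coset(h)]
    for g in G
    for h in G
)
```

Here the quotient of a six-element group by a three-element kernel has two cosets, in 1-to-1 correspondence with the two elements of the image.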


The det homomorphism

An important homomorphism for real and complex matrix groups G is the
determinant map

$$\det : G \to \mathbb{C}^\times,$$

where $\mathbb{C}^\times$ denotes the multiplicative group of nonzero complex numbers.
The determinant map is a homomorphism because det is multiplicative,
$\det(AB) = \det(A)\det(B)$, a fact well known from linear algebra.

The kernel of det, consisting of the matrices with determinant 1, is 
therefore a normal subgroup of G. Many important Lie groups arise in 
precisely this way, as we will see in Chapter 3. 
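The multiplicative property of det, and the normality of its kernel, can be checked on $2 \times 2$ examples. The sketch below is illustrative only: matrices are nested lists, the particular matrices `A` and `B` are arbitrary choices, and `matmul`, `det`, and `inverse` are hypothetical helpers written for the $2 \times 2$ case with exact rational arithmetic.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
        for r in range(2)
    ]

def det(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inverse(A):
    """Inverse of a 2x2 matrix with nonzero determinant, using exact fractions."""
    d = Fraction(det(A))
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

A = [[2, 1], [1, 1]]   # determinant 1, so A lies in the kernel of det
B = [[1, 3], [0, 2]]   # determinant 2

AB = matmul(A, B)

# Conjugating a kernel element by any invertible matrix stays in the kernel.
conjugate = matmul(matmul(inverse(B), A), B)
```

Since $\det(B^{-1}AB) = \det(B)^{-1}\det(A)\det(B) = \det(A)$, conjugation cannot move a matrix out of the kernel, which is the normality statement in this setting.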

Simple groups 

A many-to-1 homomorphism of a group G maps it onto a group G' that 
is “simpler” than G (or, at any rate, not more complicated than G). For 
this reason, groups that admit no such homomorphism, other than the ho- 
momorphism sending all elements to 1, are called simple. Equivalently, a 
nontrivial group is simple if it contains no normal subgroups other than 
itself and the trivial group. 

One of the main goals of group theory in general, and Lie group theory 
in particular, is to find all the simple groups. We find the first interesting 
example in the next section. 

Exercises 

2.2.1 Check that $z \mapsto z^2$ is a homomorphism of $\mathbb{S}^1$. What is its
kernel? What are the cosets of the kernel?

2.2.2 Show directly (that is, without appealing to Exercise 2.2.1) that pairs
$\{\pm z_\alpha\}$, where $z_\alpha = \cos\alpha + i\sin\alpha$, form a group $G$
when pairs are multiplied by the rule

\[ \{\pm z_\alpha\} \cdot \{\pm z_\beta\} = \{\pm(z_\alpha z_\beta)\}. \]

Show also that the function $\varphi : \mathbb{S}^1 \to G$ that sends both
$z_\alpha, -z_\alpha \in \mathbb{S}^1$ to the pair $\{\pm z_\alpha\}$ is a 2-to-1
homomorphism.

2.2.3 Show that $z \mapsto z^2$ is a well-defined map from $G$ onto
$\mathbb{S}^1$, where $G$ is the group described in Exercise 2.2.2, and that
this map is an isomorphism.

The space that consists of the pairs $\{\pm z_\alpha\}$ of opposite (or
“antipodal”) points on the circle is called the real projective line
$\mathbb{RP}^1$. Thus the above exercises show that the real projective line
has a natural group structure, under which it is isomorphic to the circle
group $\mathbb{S}^1$.

In the next section we will consider the real projective space $\mathbb{RP}^3$,
consisting of the antipodal point pairs $\{\pm q\}$ on the 3-sphere
$\mathbb{S}^3$. These pairs likewise have a natural product operation, which
makes $\mathbb{RP}^3$ a group. In fact, it is the group SO(3) of rotations of
$\mathbb{R}^3$. We will show that $\mathbb{RP}^3$ is not the same group as
$\mathbb{S}^3$, because SO(3) is simple and $\mathbb{S}^3$ is not.

We can see right now that $\mathbb{S}^3$ is not simple, by finding a nontrivial
normal subgroup.

2.2.4 Show that $\{\pm 1\}$ is a normal subgroup of $\mathbb{S}^3$.

However, it turns out that $\{\pm 1\}$ is the only nontrivial normal subgroup of
$\mathbb{S}^3$. In particular, the subgroup $\mathbb{S}^1$ that we found in
Section 2.1 is not normal.

2.2.5 Show that $\mathbb{S}^1$ is not a normal subgroup of $\mathbb{S}^3$.

2.3 The groups SU(2) and SO(3) 

The group SO(2) of rotations of $\mathbb{R}^2$ about O can be viewed as a
geometric object, namely the unit circle in the plane, as we observed in
Section 1.1.

The unit circle, $\mathbb{S}^1$, is the first in the series of unit n-spheres
$\mathbb{S}^n$, the nth of which consists of the points at distance 1 from the
origin in $\mathbb{R}^{n+1}$. Thus $\mathbb{S}^2$ is the ordinary sphere,
consisting of the points at distance 1 from the origin in $\mathbb{R}^3$.
Unfortunately (for those who would like an example of an easily visualized
but nontrivial Lie group) there is no rule for multiplying points that makes
$\mathbb{S}^2$ a Lie group. In fact, the only other Lie group among the
n-spheres is $\mathbb{S}^3$. As we saw in Section 1.3, it becomes a group when
its points are viewed as unit quaternions, under the operation of quaternion
multiplication.

The group $\mathbb{S}^3$ of unit quaternions can also be viewed as the group of
2 × 2 complex matrices of the form

\[ Q = \begin{pmatrix} a + di & -b - ci \\ b - ci & a - di \end{pmatrix}, \quad\text{where } \det(Q) = 1, \]

because these are precisely the quaternions of absolute value 1. Such
matrices are called unitary, and the group $\mathbb{S}^3$ is also known as the
special unitary group SU(2). Unitary matrices are the complex counterpart of
orthogonal matrices, and we study the analogy between the two in Chapters 3
and 4.
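
To make the matrix form concrete, the sketch below builds the matrix $Q$ from real coefficients $a, b, c, d$ and checks two facts on sample values: $\det(Q) = a^2 + b^2 + c^2 + d^2$, and a product of two matrices of this shape again has this shape. The function names `Q`, `det2`, and `matmul2` and the sample coefficients are assumptions chosen for illustration.

```python
def Q(a, b, c, d):
    """2 x 2 complex matrix with entries a+di, -b-ci, b-ci, a-di."""
    return ((complex(a, d), complex(-b, -c)),
            (complex(b, -c), complex(a, -d)))

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(m, n):
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# det(Q) equals a^2 + b^2 + c^2 + d^2 (here 1 + 4 + 9 + 16 = 30).
assert det2(Q(1, 2, 3, 4)) == 30

# The product of two such matrices has the same shape: read a, d off the
# top-left entry and b, c off the bottom-left entry, then rebuild.
P = matmul2(Q(1, 2, 3, 4), Q(2, -1, 0, 5))
a, d = P[0][0].real, P[0][0].imag
b, c = P[1][0].real, -P[1][0].imag
assert P == Q(a, b, c, d)

# Determinants multiply, so matrices of determinant 1 stay at determinant 1.
assert det2(P) == 30 * 30
```

Integer coefficients are used so that the floating-point arithmetic is exact; dividing each sample by its absolute value would give elements of SU(2) itself.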

The group SU(2) is closely related to the group SO(3) of rotations of
$\mathbb{R}^3$. As we saw in Section 1.5, rotations of $\mathbb{R}^3$ correspond
1-to-1 to the pairs $\pm t$ of antipodal unit quaternions, the rotation being
induced on $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ by
the conjugation map $q \mapsto t^{-1}qt$. Also, the group operation of SO(3)
corresponds to quaternion multiplication, because if one rotation is induced
by conjugation by $t_1$, and another by conjugation by $t_2$, then conjugation
by $t_1 t_2$ induces the product rotation (first rotation followed by the
second). Of course, we multiply pairs $\pm t$ of quaternions by the rule

\[ (\pm t_1)(\pm t_2) = \pm t_1 t_2. \]
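
These statements about conjugation are easy to test with exact rational arithmetic. The following is a minimal sketch in which a quaternion $a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ is stored as a tuple `(a, b, c, d)`; the helper names and the sample quaternions are assumptions made for this example.

```python
from fractions import Fraction as F

def qmul(p, q):
    """Quaternion product of p = (a1, b1, c1, d1) and q = (a2, b2, c2, d2)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def neg(q):
    return tuple(-x for x in q)

def rotate(t, q):
    """Conjugation q -> t^{-1} q t; for a unit quaternion, t^{-1} is its conjugate."""
    return qmul(qmul(qconj(t), q), t)

t1 = (F(3, 5), F(4, 5), F(0), F(0))        # unit quaternion
t2 = (F(1, 2), F(1, 2), F(1, 2), F(1, 2))  # unit quaternion
q = (F(0), F(1), F(2), F(3))               # pure imaginary quaternion

# t and -t induce the same rotation.
assert rotate(t1, q) == rotate(neg(t1), q)

# Conjugation by t1 followed by conjugation by t2 is conjugation by t1 t2.
assert rotate(t2, rotate(t1, q)) == rotate(qmul(t1, t2), q)

# The image is again pure imaginary and has the same length as q.
r = rotate(t1, q)
assert r[0] == 0 and sum(x * x for x in r) == sum(x * x for x in q)
```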

We therefore identify SO(3) with the group $\mathbb{RP}^3$ of unit quaternion
pairs $\pm t$ under this product operation. The map
$\varphi : \mathrm{SU}(2) \to \mathrm{SO}(3)$ defined by $\varphi(t) = \{\pm t\}$
is a 2-to-1 homomorphism, because the two elements $t$ and $-t$ of SU(2) go to
the single pair $\pm t$ in SO(3). Thus SO(3) looks “simpler” than SU(2) because
SO(3) has only one element where SU(2) has two. Indeed, SO(3) is “simpler”
because SU(2) is not simple (it has the normal subgroup $\{\pm 1\}$) and SO(3)
is. We now prove this famous property of SO(3) by showing that SO(3) has no
nontrivial normal subgroup.

Simplicity of SO(3). The only nontrivial subgroup of SO(3) closed under
conjugation is SO(3) itself.

Proof. Suppose that $H$ is a nontrivial subgroup of SO(3), so $H$ includes a
nontrivial rotation, say the rotation $h$ about axis $l$ through angle $\alpha$.

Now suppose that $H$ is normal, so $H$ also includes all elements $g^{-1}hg$
for $g \in \mathrm{SO}(3)$. If $g$ moves axis $l$ to axis $m$, then $g^{-1}hg$ is
the rotation about axis $m$ through angle $\alpha$. (In detail, $g^{-1}$ moves $m$
to $l$, $h$ rotates through angle $\alpha$ about $l$, then $g$ moves $l$ back to
$m$.) Thus the normal subgroup $H$ includes the rotations through angle
$\alpha$ about all possible axes.

Now a rotation through $\alpha$ about $P$, followed by rotation through $\alpha$
about $Q$, equals rotation through angle $\theta$ about $R$, where $R$ and
$\theta$ are as shown in Figure 2.3. As in Exercise 1.5.6, we obtain the
rotation about $P$ by successive reflections in the great circles $PR$ and
$PQ$, and then the rotation about $Q$ by successive reflections in the great
circles $PQ$ and $QR$. In this sequence of four reflections, the reflections in
$PQ$ cancel out, leaving the reflections in $PR$ and $QR$ that define the
rotation about $R$.

As $P$ varies continuously over some interval of the great circle through
$P$ and $Q$, $\theta$ varies continuously over some interval. ($R$ may also
vary, but this does not matter.) It follows that $\theta$ takes some value of
the form

\[ \frac{m\pi}{n}, \quad\text{where $m$ is odd}, \]

because such numbers are dense in $\mathbb{R}$. The n-fold product of this
rotation also belongs to $H$, and it is a rotation about $R$ through $m\pi$,
where $m$ is odd. The latter rotation is simply rotation through $\pi$, so $H$
includes rotations through $\pi$ about any point on the sphere (by conjugation
with a suitable $g$ again).

Finally, taking the product of rotations with $\alpha/2 = \pi/2$ in Figure 2.3,
it is clear that we can get a rotation about $R$ through any angle $\theta$
between 0 and $2\pi$. Hence $H$ includes all the rotations in SO(3). □
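
The last step can be illustrated numerically. The sketch below assumes the quaternion description of rotations (a rotation through $\theta$ about the unit axis $u$ corresponds to $\pm(\cos\frac{\theta}{2} + u\sin\frac{\theta}{2})$), so a half-turn about $u$ is represented by $\pm u$ itself. It checks that two half-turns about axes separated by an angle $\phi$ compose to a rotation through $2\phi$, so varying the separation produces every rotation angle. The helper names are hypothetical.

```python
import math

def qmul(p, q):
    """Quaternion product of p = (a1, b1, c1, d1) and q = (a2, b2, c2, d2)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def half_turn(phi):
    """Quaternion of the half-turn about the axis (cos phi, sin phi, 0)."""
    return (0.0, math.cos(phi), math.sin(phi), 0.0)

def rotation_angle(q):
    """Angle in [0, pi] of the rotation represented by the pair +-q."""
    return 2 * math.acos(min(1.0, abs(q[0])))

# Half-turns about two axes separated by phi compose to a rotation by 2*phi.
for degrees in (10, 30, 45, 80):
    phi = math.radians(degrees)
    composite = qmul(half_turn(0.0), half_turn(phi))
    assert math.isclose(rotation_angle(composite), 2 * phi, abs_tol=1e-9)
```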

Exercises 

Like SO(2), SO(3) contains some finite subgroups. It contains all the finite
subgroups of SO(2) in an obvious way (as rotations of $\mathbb{R}^3$ about a
fixed axis), but also three more interesting subgroups called the polyhedral
groups. Each polyhedral group is so called because it consists of the
rotations that map a regular polyhedron into itself.

Here we consider the group of 12 rotations that map a regular tetrahedron
into itself. We consider the tetrahedron whose vertices are alternate
vertices of the unit cube in
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, where the
cube has center at O and edges parallel to the $\mathbf{i}$, $\mathbf{j}$, and
$\mathbf{k}$ axes (Figure 2.4).

First, let us see why there are indeed 12 rotations that map the tetrahedron 
into itself. To do this, observe that the position of the tetrahedron is completely 
determined when we know 


• Which of the four faces is in the position of the front face in Figure 2.4.

• Which of the three edges of that face is at the bottom of the front face in
Figure 2.4.



Figure 2.4: The tetrahedron and the cube. 


2.3.1 Explain why this observation implies 12 possible positions of the
tetrahedron, and also explain why all these positions can be obtained by
rotations.

2.3.2 Similarly, explain why there are 24 rotations that map the cube into
itself (so the rotation group of the tetrahedron is different from the
rotation group of the cube).


The 12 rotations of the tetrahedron are in fact easy to enumerate with the help 
of Figure 2.5. As is clear from the figure, the tetrahedron is mapped into itself by 
two types of rotation: 

• A 1/2 turn about each line through the centers of opposite edges. 

• A 1/3 turn about each line through a vertex and the opposite face center. 

2.3.3 Show that there are 11 distinct rotations among these two types. What 
rotation accounts for the 12th position of the tetrahedron? 


Now we make use of the quaternion representation of rotations from Section
1.5. Remember that a rotation about axis $u$ through angle $\theta$ corresponds
to the quaternion pair $\pm q$, where

\[ q = \cos\frac{\theta}{2} + u\sin\frac{\theta}{2}. \]

2.3.4 Show that the identity, and the three 1/2 turns, correspond to the four
quaternion pairs $\pm 1, \pm\mathbf{i}, \pm\mathbf{j}, \pm\mathbf{k}$.

Figure 2.5: The tetrahedron and axes of rotation.

2.3.5 Show that the 1/3 turns correspond to the eight antipodal pairs among
the 16 quaternions

\[ \pm\frac{1}{2} \pm \frac{\mathbf{i}}{2} \pm \frac{\mathbf{j}}{2} \pm \frac{\mathbf{k}}{2}. \]

The 24 quaternions obtained in Exercises 2.3.4 and 2.3.5 form an exceptionally
symmetric configuration in $\mathbb{R}^4$. They are the vertices of a regular
figure called the 24-cell, copies of which form a “tiling” of $\mathbb{R}^4$.
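
Part of this symmetry can be checked by machine. The sketch below lists the 24 quaternions with exact rational coordinates and verifies that they are unit quaternions, that they fall into 12 antipodal pairs (one pair per rotation of the tetrahedron), and that they are closed under quaternion multiplication. The closure check is an extra observation added here, not a claim made in the text, and the helper names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

def qmul(p, q):
    """Quaternion product of p = (a1, b1, c1, d1) and q = (a2, b2, c2, d2)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

units = set()
# The eight quaternions +-1, +-i, +-j, +-k.
for position in range(4):
    for sign in (1, -1):
        units.add(tuple(F(sign) if k == position else F(0) for k in range(4)))
# The sixteen quaternions (+-1 +-i +-j +-k)/2.
for signs in product((1, -1), repeat=4):
    units.add(tuple(F(s, 2) for s in signs))

assert len(units) == 24
assert all(sum(x * x for x in q) == 1 for q in units)

# Antipodal pairs {q, -q}: one pair for each rotation of the tetrahedron.
pairs = {frozenset((q, tuple(-x for x in q))) for q in units}
assert len(pairs) == 12

# The 24 quaternions are closed under multiplication.
assert all(qmul(p, q) in units for p in units for q in units)
```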


2.4 Isometries of $\mathbb{R}^n$ and reflections

In this section we take up an idea that appeared briefly in the exercises
for Section 1.5: the representation of isometries as products of reflections.
There we showed that certain isometries of $\mathbb{R}^2$ and $\mathbb{R}^3$ are
products of reflections. Here we represent isometries of $\mathbb{R}^n$ as
products of reflections, and in the next section we use this result to
describe the rotations of $\mathbb{R}^4$.

We actually prove that any isometry of $\mathbb{R}^n$ that fixes O is the
product of reflections in hyperplanes through O, and then specialize to
orientation-preserving isometries. A hyperplane $H$ through O is an
$(n-1)$-dimensional subspace of $\mathbb{R}^n$, and reflection in $H$ is the
linear map of $\mathbb{R}^n$ that fixes the elements in $H$ and reverses the
vectors orthogonal to $H$.

Reflection representation of isometries. Any isometry of $\mathbb{R}^n$ that
fixes O is the product of at most n reflections in hyperplanes through O.

Proof. We argue by induction on n. For n = 1 the result is the obvious one
that the only isometries of $\mathbb{R}$ fixing O are the identity and the map
$x \mapsto -x$, which is reflection in O.

Now suppose that the result is true for $n = k - 1$ and that $f$ is an
isometry of $\mathbb{R}^k$ fixing O. If $f$ is not the identity, suppose that
$v \in \mathbb{R}^k$ is such that $f(v) = w \neq v$. Then the reflection $r_u$
in the hyperplane orthogonal to $u = v - w$ reverses $u$ and, because
$|v| = |w|$, exchanges $v$ and $w$. Hence the map $r_u f$ (“$f$ followed by
$r_u$”) fixes $v$, and so it is the identity on the subspace $\mathbb{R}v$ of
real multiples of $v$ in $\mathbb{R}^k$.

The restriction of $r_u f$ to the $\mathbb{R}^{k-1}$ orthogonal to $\mathbb{R}v$
is, by induction, the product of $\le k - 1$ reflections. It follows that
$f = r_u g$, where $g$ is the product of $\le k - 1$ reflections.

Therefore, $f$ is the product of $\le k$ reflections, and the result is true
for all n by induction. □

It follows in particular that any orientation-preserving isometry of
$\mathbb{R}^3$ is the product of 0 or 2 reflections (because the product of an
odd number of reflections reverses orientation). Thus any such isometry is a
rotation about an axis passing through O.
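
The induction can be turned into a small procedure. The following is a sketch under stated assumptions, not a definitive implementation: the isometry $f$ is given as a Python function on lists of floats, the standard basis vectors play the role of $v$ one at a time, and the fact used at each step is that the reflection $r_u$ with $u = v - w$ exchanges $v$ and $w$ when $|v| = |w|$. The names `reflect`, `reflection_vectors`, and `apply_reflections` are hypothetical.

```python
import math

def reflect(u, x):
    """Reflect x in the hyperplane through O orthogonal to u."""
    uu = sum(a * a for a in u)
    ux = sum(a * b for a, b in zip(u, x))
    return [b - 2 * ux / uu * a for a, b in zip(u, x)]

def reflection_vectors(f, n, tol=1e-12):
    """Return u_1, ..., u_m (m <= n) such that f = r_{u_1} r_{u_2} ... r_{u_m}.

    Assumes f is an isometry of R^n that fixes O.
    """
    us = []

    def g(x):
        # g = r_{u_m} ... r_{u_1} f, using the reflections found so far.
        y = f(x)
        for u in us:
            y = reflect(u, y)
        return y

    for i in range(n):
        v = [1.0 if j == i else 0.0 for j in range(n)]
        w = g(v)
        u = [a - b for a, b in zip(v, w)]
        if math.sqrt(sum(a * a for a in u)) > tol:
            us.append(u)   # after this step, g fixes the basis vector v
    return us

def apply_reflections(us, x):
    """Apply r_{u_1} r_{u_2} ... r_{u_m} to x (rightmost reflection first)."""
    y = list(x)
    for u in reversed(us):
        y = reflect(u, y)
    return y

def quarter_turn(x):
    """Rotation of R^3 through a quarter turn about the third axis."""
    return [-x[1], x[0], x[2]]

us = reflection_vectors(quarter_turn, 3)
assert len(us) == 2   # a rotation of R^3 is a product of two reflections
y = apply_reflections(us, [1.0, 2.0, 3.0])
assert all(abs(a - b) < 1e-9 for a, b in zip(y, quarter_turn([1.0, 2.0, 3.0])))
```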

This theorem is sometimes known as the Cartan-Dieudonné theorem, after a more
general theorem proved by Cartan [1938], and generalized further by
Dieudonné. Cartan’s theorem concerns “reflections” in spaces with real or
complex coordinates, and Dieudonné’s extends it to spaces with coordinates
from finite fields.

Exercises 

Assuming that reflections are linear, the representation of isometries as
products of reflections shows that all isometries fixing the origin are
linear maps. In fact, there is a nice direct proof that all such isometries
(including reflections) are linear, pointed out to me by Marc Ryser. We
suppose that $f$ is an isometry that fixes O, and that $u$ and $v$ are any
points in $\mathbb{R}^n$.

2.4.1 Prove that $f$ preserves straight lines and midpoints of line segments.

2.4.2 Using the fact that $u + v$ is the midpoint of the line joining $2u$ and
$2v$, and Exercise 2.4.1, show that $f(u + v) = f(u) + f(v)$.

2.4.3 Also prove that $f(ru) = rf(u)$ for any real number $r$.

It is also true that reflections have determinant −1, hence the determinant
detects the “reversal of orientation” effected by a reflection.

2.4.4 Show that reflection in the hyperplane orthogonal to a coordinate axis
has determinant −1, and generalize this result to any reflection.


2.5 Rotations of $\mathbb{R}^4$ and pairs of quaternions

A linear map is called orientation-preserving if its determinant is positive,
and orientation-reversing otherwise. Reflections are linear and
orientation-reversing, so a product of reflections is orientation-preserving
if and only if it contains an even number of terms. We define a rotation of
$\mathbb{R}^n$ about O to be an orientation-preserving isometry that fixes O.

Thus it follows from the Cartan-Dieudonné theorem that any rotation of
$\mathbb{R}^4$ is the product of 0, 2, or 4 reflections. The exact number is not
important here; what we really want is a way to represent reflections by
quaternions, as a stepping-stone to the representation of rotations by
quaternions. Not surprisingly, each reflection is specified by the quaternion
orthogonal to the hyperplane of reflection. More surprisingly, a rotation is
specified by just two quaternions, regardless of the number of reflections
needed to compose it. Our proof follows Conway and Smith [2003], p. 41.

Quaternion representation of reflections. Reflection of
$\mathbb{H} = \mathbb{R}^4$ in the hyperplane through O orthogonal to the unit
quaternion $u$ is the map that sends each $q \in \mathbb{H}$ to
$-u\bar{q}u$.

Proof. First observe that the map $q \mapsto -u\bar{q}u$ is an isometry. This
is because

• $q \mapsto -\bar{q}$ reverses the real part of $q$ and keeps the imaginary
part fixed, hence it is reflection in the hyperplane spanned by $\mathbf{i}$,
$\mathbf{j}$, and $\mathbf{k}$.

• Multiplication on the left by the unit quaternion $u$ is an isometry by the
argument in Section 1.4, and there is a similar argument for multiplication
on the right.

Next notice that the map $q \mapsto -u\bar{q}u$ sends

\[
\begin{aligned}
vu \ \text{to}\ -u\,\overline{(vu)}\,u &= -u\bar{u}\bar{v}u \quad\text{because } \overline{(vu)} = \bar{u}\bar{v}, \\
&= -\bar{v}u \quad\text{because } u\bar{u} = |u|^2 = 1.
\end{aligned}
\]

In particular, the map sends $u$ to $-u$, so vectors parallel to $u$ are
reversed. And it sends $\mathbf{i}u$ to $\mathbf{i}u$, because
$\bar{\mathbf{i}} = -\mathbf{i}$, and similarly $\mathbf{j}u$ to $\mathbf{j}u$
and $\mathbf{k}u$ to $\mathbf{k}u$. Thus the vectors $\mathbf{i}u$,
$\mathbf{j}u$, and $\mathbf{k}u$, which span the hyperplane orthogonal to $u$,
are fixed. Hence the map $q \mapsto -u\bar{q}u$ is reflection in this
hyperplane. □
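
The reflection formula can be verified on a concrete unit quaternion. The sketch below uses exact rational arithmetic and checks that $q \mapsto -u\bar{q}u$ reverses $u$, fixes $\mathbf{i}u$, $\mathbf{j}u$, $\mathbf{k}u$ (which are orthogonal to $u$), and is its own inverse. The helper names and the choice $u = (1 + \mathbf{i} + \mathbf{j} + \mathbf{k})/2$ are assumptions made for illustration.

```python
from fractions import Fraction as F

def qmul(p, q):
    """Quaternion product of p = (a1, b1, c1, d1) and q = (a2, b2, c2, d2)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def neg(q):
    return tuple(-x for x in q)

def dot(p, q):
    """Inner product of p and q as vectors of R^4."""
    return sum(x * y for x, y in zip(p, q))

def reflect_in(u, q):
    """The map q -> -u conj(q) u for a unit quaternion u."""
    return neg(qmul(qmul(u, qconj(q)), u))

u = (F(1, 2), F(1, 2), F(1, 2), F(1, 2))
i, j, k = ((F(0), F(1), F(0), F(0)),
           (F(0), F(0), F(1), F(0)),
           (F(0), F(0), F(0), F(1)))

# u is reversed.
assert reflect_in(u, u) == neg(u)

# iu, ju, ku are orthogonal to u and are fixed.
for e in (i, j, k):
    eu = qmul(e, u)
    assert dot(eu, u) == 0
    assert reflect_in(u, eu) == eu

# Reflecting twice gives the identity.
q = (F(1), F(2), F(3), F(4))
assert reflect_in(u, reflect_in(u, q)) == q
```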
Quaternion representation of rotations. Any rotation of
$\mathbb{H} = \mathbb{R}^4$ about O is a map of the form $q \mapsto vqw$, where
$v$ and $w$ are unit quaternions.

Proof. It follows from the quaternion representation of reflections that the
result of successive reflections in the hyperplanes orthogonal to the unit
quaternions $u_1, u_2, \ldots, u_{2n}$ is the map

\[ q \mapsto u_{2n} \cdots \bar{u}_3 u_2 \bar{u}_1 \, q \, \bar{u}_1 u_2 \bar{u}_3 \cdots u_{2n}, \]

because an even number of sign changes and conjugations makes no change. The
pre- and postmultipliers are in general two different unit quaternions,
$u_{2n} \cdots \bar{u}_3 u_2 \bar{u}_1 = v$ and
$\bar{u}_1 u_2 \bar{u}_3 \cdots u_{2n} = w$, say, so the general rotation of
$\mathbb{R}^4$ is a map of the form

\[ q \mapsto vqw, \quad\text{where $v$ and $w$ are unit quaternions.} \]


Conversely, any map of this form is a rotation, because multiplication of
$\mathbb{H} = \mathbb{R}^4$ on either side by a unit quaternion is an
orientation-preserving isometry. We already know that multiplication by a
unit quaternion is an isometry, by Section 1.4. And it preserves orientation
by the following argument.

Multiplication of $\mathbb{H} = \mathbb{R}^4$ by a unit quaternion

\[ v = \begin{pmatrix} a + id & -b - ic \\ b - ic & a - id \end{pmatrix}, \quad\text{where } a^2 + b^2 + c^2 + d^2 = 1, \]

is a linear transformation of $\mathbb{R}^4$ with matrix

\[ R_v = \begin{pmatrix} a & -d & -b & c \\ d & a & -c & -b \\ b & c & a & d \\ -c & b & -d & a \end{pmatrix}, \]

where the 2 × 2 submatrices represent the complex-number entries in $v$. It
can be checked that $\det(R_v) = 1$. So multiplication by $v$, on either side,
preserves orientation. □
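
The determinant claim can be checked directly. The sketch below builds the 4 × 4 matrix with rows $(a, -d, -b, c)$, $(d, a, -c, -b)$, $(b, c, a, d)$, $(-c, b, -d, a)$ and evaluates its determinant by cofactor expansion. For general real $a, b, c, d$ the determinant is $(a^2 + b^2 + c^2 + d^2)^2$, which equals 1 for a unit quaternion. The function names `R` and `det` are illustrative.

```python
from fractions import Fraction as F

def R(a, b, c, d):
    """The 4 x 4 real matrix built from the coefficients a, b, c, d."""
    return [[ a, -d, -b,  c],
            [ d,  a, -c, -b],
            [ b,  c,  a,  d],
            [-c,  b, -d,  a]]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# A unit quaternion gives determinant exactly 1.
half = F(1, 2)
assert det(R(half, half, half, half)) == 1
assert det(R(F(3, 5), F(0), F(4, 5), F(0))) == 1

# In general the determinant is (a^2 + b^2 + c^2 + d^2)^2; here 30^2.
assert det(R(1, 2, 3, 4)) == 900
```

Because the determinant is positive for every nonzero quaternion, the sign never changes along a path of unit quaternions, which is another way to see that orientation is preserved.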


Exercises 

The following exercises study the rotation $q \mapsto \mathbf{i}q$ of
$\mathbb{H} = \mathbb{R}^4$, first expressing it as a product of “plane
rotations” (of the planes spanned by $1, \mathbf{i}$ and
$\mathbf{j}, \mathbf{k}$ respectively), then breaking it down to a product of
four reflections.

2.5.1 Check that $q \mapsto \mathbf{i}q$ sends 1 to $\mathbf{i}$, $\mathbf{i}$
to −1 and $\mathbf{j}$ to $\mathbf{k}$, $\mathbf{k}$ to $-\mathbf{j}$. How many
points of $\mathbb{R}^4$ are fixed by this map?

2.5.2 Show that the rotation that sends 1 to $\mathbf{i}$, $\mathbf{i}$ to −1
and leaves $\mathbf{j}$, $\mathbf{k}$ fixed is the product of reflections in
the hyperplanes orthogonal to $u_1 = \mathbf{i}$ and
$u_2 = (\mathbf{i} - 1)/\sqrt{2}$.

2.5.3 Show that the rotation that sends $\mathbf{j}$ to $\mathbf{k}$,
$\mathbf{k}$ to $-\mathbf{j}$ and leaves 1, $\mathbf{i}$ fixed is the product
of reflections in the hyperplanes orthogonal to $u_3 = \mathbf{k}$ and
$u_4 = (\mathbf{k} - \mathbf{j})/\sqrt{2}$.

It follows, by the formula $q \mapsto -u\bar{q}u$ for reflection, that the
product of rotations in Exercises 2.5.2 and 2.5.3 is the product

\[ q \mapsto u_4 \bar{u}_3 u_2 \bar{u}_1 \, q \, \bar{u}_1 u_2 \bar{u}_3 u_4 \]

of reflections in the hyperplanes orthogonal to $u_1, u_2, u_3, u_4$
respectively.

2.5.4 Check that $u_4 \bar{u}_3 u_2 \bar{u}_1 = \mathbf{i}$ and
$\bar{u}_1 u_2 \bar{u}_3 u_4 = 1$, so the product of the four reflections is
indeed $q \mapsto \mathbf{i}q$.


2.6 Direct products of groups 

Before we analyze rotations of $\mathbb{R}^4$ from the viewpoint of group
theory, it is desirable to review the concept of direct product or Cartesian
product of groups.

Definition. If $A$ and $B$ are groups then their direct product $A \times B$ is
the set of ordered pairs $(a, b)$, where $a \in A$ and $b \in B$, under the
“product of pairs” operation defined by

\[ (a_1, b_1)(a_2, b_2) = (a_1 a_2, b_1 b_2). \]

It is easy to check that this product operation is associative, that the
identity element of $A \times B$ is the pair $(1_A, 1_B)$, where $1_A$ is the
identity of $A$ and $1_B$ is the identity of $B$, and that $(a, b)$ has inverse
$(a^{-1}, b^{-1})$. Thus $A \times B$ is indeed a group.
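
The group axioms for a direct product can be checked exhaustively on a small case. The sketch below uses the hypothetical choice $A = \mathbb{Z}_2$ and $B = \mathbb{Z}_3$, both written additively, so the “product of pairs” becomes componentwise addition and the identity is $(0, 0)$. The names `mul`, `inverse`, and `elements` are illustrative.

```python
from itertools import product

# Hypothetical example: A = Z_2 and B = Z_3 under addition mod 2 and mod 3.
A = range(2)
B = range(3)

def mul(x, y):
    """Product of pairs: operate in A on first entries, in B on second entries."""
    (a1, b1), (a2, b2) = x, y
    return ((a1 + a2) % 2, (b1 + b2) % 3)

def inverse(x):
    a, b = x
    return ((-a) % 2, (-b) % 3)

elements = list(product(A, B))
identity = (0, 0)

# Associativity, identity, and inverses, checked over all elements.
assert all(mul(mul(x, y), z) == mul(x, mul(y, z))
           for x in elements for y in elements for z in elements)
assert all(mul(identity, x) == x == mul(x, identity) for x in elements)
assert all(mul(x, inverse(x)) == identity for x in elements)
print(len(elements))
```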

Many important groups are nontrivial direct products; that is, they have
the form $A \times B$ where neither $A$ nor $B$ is the trivial group $\{1\}$.
For example:

• The group $\mathbb{R}^2$, under vector addition, is the direct product
$\mathbb{R} \times \mathbb{R}$. More generally, $\mathbb{R}^n$ is the n-fold
direct product $\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}$.

• If $A$ and $B$ are groups of n × n matrices, then the matrices of the form

\[ \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}, \quad\text{where $a \in A$ and $b \in B$,} \]

make up a group isomorphic to $A \times B$ under matrix multiplication, where
0 is the n × n zero matrix. This is because of the so-called block
multiplication of matrices, according to which

\[ \begin{pmatrix} a_1 & 0 \\ 0 & b_1 \end{pmatrix} \begin{pmatrix} a_2 & 0 \\ 0 & b_2 \end{pmatrix} = \begin{pmatrix} a_1 a_2 & 0 \\ 0 & b_1 b_2 \end{pmatrix}. \]

• It follows, from the previous item, that $\mathbb{R}^n$ is isomorphic to a
2n × 2n matrix group, because $\mathbb{R}$ is isomorphic to the group of
matrices

\[ \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}, \quad\text{where $x \in \mathbb{R}$.} \]

• The group $\mathbb{S}^1 \times \mathbb{S}^1$ is a group called the
(two-dimensional) torus $\mathbb{T}^2$. More generally, the n-fold direct
product of $\mathbb{S}^1$ factors is called the n-dimensional torus
$\mathbb{T}^n$.

We call $\mathbb{S}^1 \times \mathbb{S}^1$ a torus because its elements
$(\theta, \phi)$, where $\theta, \phi \in \mathbb{S}^1$, can be viewed as the
points on the torus surface (Figure 2.6).

Since the groups $\mathbb{R}$ and $\mathbb{S}^1$ are abelian, the same is true
of all their direct products $\mathbb{R}^m \times \mathbb{T}^n$. It can be shown
that the latter groups include all the connected abelian matrix Lie groups.

Exercises

If we let $x_1, x_2, x_3, x_4$ be the coordinates along mutually orthogonal
axes in $\mathbb{R}^4$, then it is possible to “rotate” the $x_1$ and $x_2$ axes
while keeping the $x_3$ and $x_4$ axes fixed.

2.6.1 Write a 4 × 4 matrix for the transformation that rotates the
$(x_1, x_2)$-plane through angle $\theta$ while keeping the $x_3$- and
$x_4$-axes fixed.

2.6.2 Write a 4 × 4 matrix for the transformation that rotates the
$(x_3, x_4)$-plane through angle $\phi$ while keeping the $x_1$- and
$x_2$-axes fixed.

2.6.3 Observe that the rotations in Exercise 2.6.1 form an $\mathbb{S}^1$, as do
the rotations in Exercise 2.6.2, and deduce that SO(4) contains a subgroup
isomorphic to $\mathbb{T}^2$.

The groups of the form $\mathbb{R}^m \times \mathbb{T}^n$ may be called
“generalized cylinders,” based on the simplest example
$\mathbb{R} \times \mathbb{S}^1$.

2.6.4 Why is it appropriate to call the group $\mathbb{R} \times \mathbb{S}^1$ a
cylinder?

The notation $\mathbb{S}^n$ is unfortunately not compatible with the direct
product notation (at least not the way the notation $\mathbb{R}^n$ is).

2.6.5 Explain why $\mathbb{S}^3 = \mathrm{SU}(2)$ is not the same group as
$\mathbb{S}^1 \times \mathbb{S}^1 \times \mathbb{S}^1$.


2.7 The map from SU(2) x SU(2) to SO(4) 

In Section 2.5 we showed that the rotations of $\mathbb{R}^4$ are precisely the
maps $q \mapsto vqw$, where $v$ and $w$ run through all the unit quaternions.
Since $v^{-1}$ is a unit quaternion if and only if $v$ is, it is equally valid
to represent each rotation of $\mathbb{R}^4$ by a map of the form
$q \mapsto v^{-1}qw$, where $v$ and $w$ are unit quaternions. The latter
representation is more convenient for what comes next.

The pairs of unit quaternions $(v, w)$ form a group under the operation
defined by

\[ (v_1, w_1) \cdot (v_2, w_2) = (v_1 v_2, w_1 w_2), \]

where the products $v_1 v_2$ and $w_1 w_2$ on the right side are ordinary
quaternion products. Since the $v$ come from the group SU(2) of unit
quaternions, and the $w$ likewise, the group of pairs $(v, w)$ is the direct
product SU(2) × SU(2) of SU(2) with itself.

The map that sends each pair $(v, w) \in \mathrm{SU}(2) \times \mathrm{SU}(2)$ to
the rotation $q \mapsto v^{-1}qw$ in SO(4) is a homomorphism
$\varphi : \mathrm{SU}(2) \times \mathrm{SU}(2) \to \mathrm{SO}(4)$. This is
because

• the product of the map $q \mapsto v_1^{-1} q w_1$ corresponding to
$(v_1, w_1)$

• with the map $q \mapsto v_2^{-1} q w_2$ corresponding to $(v_2, w_2)$

• is the map $q \mapsto v_2^{-1} v_1^{-1} q w_1 w_2$,

• which is the map $q \mapsto (v_1 v_2)^{-1} q (w_1 w_2)$ corresponding to the
product $(v_1 v_2, w_1 w_2)$ of $(v_1, w_1)$ and $(v_2, w_2)$.

This homomorphism is onto SO(4), because each rotation of $\mathbb{R}^4$ can
be expressed in the form $q \mapsto v^{-1}qw$, but one might expect it to be
very many-to-one, since many pairs $(v, w)$ of unit quaternions conceivably
give the same rotation. Surprisingly, this is not so. The representation of
rotations by pairs is “unique up to sign” in the following sense: if $(v, w)$
gives a certain rotation, the only other pair that gives the same rotation is
$(-v, -w)$.

To prove this, it suffices to prove that the kernel of the homomorphism
$\varphi : \mathrm{SU}(2) \times \mathrm{SU}(2) \to \mathrm{SO}(4)$ has two
elements.

Size of the kernel. The homomorphism
$\varphi : \mathrm{SU}(2) \times \mathrm{SU}(2) \to \mathrm{SO}(4)$ is 2-to-1,
because its kernel has two elements.

Proof. Suppose that $(v, w)$ is in the kernel, so $q \mapsto v^{-1}qw$ is the
identity rotation. In particular, this rotation fixes 1, so

\[ v^{-1} 1 w = 1; \quad\text{hence } v = w. \]

Thus the map is in fact $q \mapsto v^{-1}qv$, which we know (from Section 1.5)
fixes the real axis and rotates the space of pure imaginary quaternions. Only
if $v = 1$ or $v = -1$ does the map fix everything; hence the kernel of
$\varphi$ has only two elements, $(1, 1)$ and $(-1, -1)$.

The left cosets of the kernel are therefore the 2-element sets

\[ (v, w)(\pm 1, \pm 1) = (\pm v, \pm w), \]

and each coset corresponds to a distinct rotation of $\mathbb{R}^4$, by the
fundamental homomorphism theorem of Section 2.2. □
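
A finite experiment illustrates both the homomorphism property and the size of the kernel. The sketch below works with exact rational quaternions and, as a stand-in for all of SU(2), a sample of 24 unit quaternions ($\pm 1, \pm\mathbf{i}, \pm\mathbf{j}, \pm\mathbf{k}$ and $(\pm 1 \pm \mathbf{i} \pm \mathbf{j} \pm \mathbf{k})/2$). Searching this sample is only an illustration of the theorem, not a proof, and the helper names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def qmul(p, q):
    """Quaternion product of p = (a1, b1, c1, d1) and q = (a2, b2, c2, d2)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def neg(q):
    return tuple(-x for x in q)

def act(v, w, q):
    """The map q -> v^{-1} q w; v^{-1} is the conjugate because |v| = 1."""
    return qmul(qmul(qconj(v), q), w)

# Basis 1, i, j, k of H = R^4.
BASIS = [tuple(F(int(k == i)) for k in range(4)) for i in range(4)]
ONE = BASIS[0]

# Sample of 24 unit quaternions.
units = list(BASIS) + [neg(e) for e in BASIS]
units += [tuple(F(s, 2) for s in signs) for signs in product((1, -1), repeat=4)]

v1, w1, v2, w2 = units[9], units[3], units[14], units[20]
q = (F(1), F(2), F(3), F(4))

# (v, w) and (-v, -w) give the same map.
assert act(v1, w1, q) == act(neg(v1), neg(w1), q)

# The map of (v1, w1) followed by the map of (v2, w2) is the map of the
# product pair (v1 v2, w1 w2).
assert act(v2, w2, act(v1, w1, q)) == act(qmul(v1, v2), qmul(w1, w2), q)

# Within the sample, only (1, 1) and (-1, -1) fix every basis quaternion.
kernel = [(v, w) for v in units for w in units
          if all(act(v, w, e) == e for e in BASIS)]
assert set(kernel) == {(ONE, ONE), (neg(ONE), neg(ONE))}
```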

This theorem shows that SO(4) is “almost” the same as SU(2) × SU(2), and the
latter is far from being a simple group. For example, the subgroup of pairs
$(v, 1)$ is a nontrivial normal subgroup, but clearly not the whole of
SU(2) × SU(2). This gives us a way to show that SO(4) is not simple.

SO(4) is not simple. There is a nontrivial normal subgroup of SO(4), not
equal to SO(4).

Proof. The subgroup of pairs $(v, 1) \in \mathrm{SU}(2) \times \mathrm{SU}(2)$ is
normal; in fact, it is the kernel of the map $(v, w) \mapsto (1, w)$, which is
clearly a homomorphism.

The corresponding subgroup of SO(4) consists of maps of the form
$q \mapsto v^{-1}q1$, which likewise form a normal subgroup of SO(4). But this
subgroup is not the whole of SO(4). For example, it does not include the map
$q \mapsto qw$ for any $w \neq \pm 1$, by the “unique up to sign”
representation of rotations by pairs $(v, w)$. □

Exercises 

An interesting subgroup $\mathrm{Aut}(\mathbb{H})$ of SO(4) consists of the
continuous automorphisms of $\mathbb{H} = \mathbb{R}^4$. These are the
continuous bijections $\rho : \mathbb{H} \to \mathbb{H}$ that preserve the
quaternion sum and product, that is,

\[ \rho(p + q) = \rho(p) + \rho(q), \quad \rho(pq) = \rho(p)\rho(q) \quad\text{for any } p, q \in \mathbb{H}. \]

It is easy to check that, for each unit quaternion $u$, the $\rho$ that sends
$q \mapsto u^{-1}qu$ is an automorphism (first exercise), so it follows from
Section 1.5 that $\mathrm{Aut}(\mathbb{H})$ includes the SO(3) of rotations of
the 3-dimensional subspace
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ of pure
imaginary quaternions. The purpose of this set of exercises is to show that
all continuous automorphisms of $\mathbb{H}$ are of this form, so
$\mathrm{Aut}(\mathbb{H}) = \mathrm{SO}(3)$.

2.7.1 Check that $q \mapsto u^{-1}qu$ is an automorphism of $\mathbb{H}$ for any
unit quaternion $u$.

Now suppose that $\rho$ is any automorphism of $\mathbb{H}$.

2.7.2 Use the preservation of sums by an automorphism $\rho$ to deduce in turn
that

• $\rho$ preserves 0, that is, $\rho(0) = 0$,

• $\rho$ preserves differences, that is, $\rho(p - q) = \rho(p) - \rho(q)$.

2.7.3 Use preservation of products to deduce that

• $\rho$ preserves 1, that is, $\rho(1) = 1$,

• $\rho$ preserves quotients, that is, $\rho(p/q) = \rho(p)/\rho(q)$ for
$q \neq 0$.

2.7.4 Deduce from Exercises 2.7.2 and 2.7.3 that $\rho(m/n) = m/n$ for any
integers $m$ and $n \neq 0$. This implies $\rho(r) = r$ for any real $r$, and
hence that $\rho$ is a linear map of $\mathbb{R}^4$. Why?

Thus we now know that a continuous automorphism $\rho$ is a linear bijection
of $\mathbb{R}^4$ that preserves the real axis, and hence $\rho$ maps
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ onto itself.
It remains to show that the restriction of $\rho$ to
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ is a
rotation, that is, an orientation-preserving isometry, because we know from
Section 1.5 that rotations of
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ are of the
form $q \mapsto u^{-1}qu$.

2.7.5 Prove in turn that

• $\rho$ preserves conjugates, that is, $\rho(\bar{q}) = \overline{\rho(q)}$,

• $\rho$ preserves distance,

• $\rho$ preserves inner product in
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$,

• $\rho(p \times q) = \rho(p) \times \rho(q)$ in
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, and hence
$\rho$ preserves orientation.

The appearance of SO(3) as the automorphism group of the quaternion algebra
$\mathbb{H}$ suggests that the automorphism group of the octonion algebra
$\mathbb{O}$ might also be of interest. It turns out to be a 14-dimensional
group called $G_2$, the first of the exceptional Lie groups mentioned (along
with $\mathbb{O}$) in Section 1.6. This link between $\mathbb{O}$ and the
exceptional groups was pointed out by Cartan [1908].


2.8 Discussion 

The concept of simple group emerged around 1830 from Galois’s theory of
equations. Galois showed that each polynomial equation has a finite group of
“symmetries” (permutations of its roots that leave its coefficients
invariant), and that the equation is solvable only if its group decomposes
in a certain way. In particular, the general quintic equation is not solvable
because its group contains the nonabelian simple group $A_5$, the group of
even permutations of five objects. The same applies to the general equation
of any degree greater than 5, because $A_n$, the group of even permutations
of n objects, is simple for any $n \ge 5$.

With this discovery, Galois effectively closed the classical theory of
equations, but he opened the (much larger) theory of groups. Specifically,
by exhibiting the nontrivial infinite family $A_n$ for $n \ge 5$, he raised
the problem of finding and classifying all finite simple groups. This
problem is much deeper than anyone could have imagined in the time of Galois,
because it depends on solving the corresponding problem for continuous
groups, or Lie groups as we now call them.

Around 1870, Sophus Lie was inspired by Galois theory to develop an
analogous theory of differential equations and their “symmetries,” which
generally form continuous groups. As with polynomial equations, simple
groups raise an obstacle to solvability. However, at that time it was not
clear what the generalization of the group concept from finite to continuous
should be. Lie understood continuous groups to be groups generated by
“infinitesimal” elements, so he thought that the rotation group of
$\mathbb{R}^3$ should include “infinitesimal rotations.” Today, we separate out
the “infinitesimal rotations” of $\mathbb{R}^3$ in a structure called
$\mathfrak{so}(3)$, the Lie algebra of SO(3). The concept of simplicity also
makes sense for $\mathfrak{so}(3)$, and is somewhat easier to establish.
Indeed, the infinitesimal elements of any continuous group $G$ form a
structure $\mathfrak{g}$ now called the Lie algebra of $G$, which captures most
of the structure of $G$ but is easier to handle. We discuss “infinitesimal
elements,” and their modern counterparts, further in Section 4.3.

It was a stroke of luck (or genius) that Lie decided to look at infinitesi- 
mal elements, because it enabled him to prove simplicity for whole infinite 
families of Lie algebras in one fell swoop. (As we will see later, most of 
the corresponding continuous groups are not quite simple, and one has to 
tease out certain small subgroups and quotient by them.) Around 1885 Lie 
proved results so general that they cover all but a finite number of simple 
Lie algebras — namely, those of the exceptional groups mentioned at the 
end of Chapter 1 (see Hawkins [2000], pp. 92-98). 

In the avalanche of Lie’s results, the special case of so(3) and SO(3) 
seems to have gone unnoticed. It gradually came to light as twentieth- 
century books on Lie theory started to work out special cases of geometric 
interest by way of illustration. In the 1920s, quantum physics also directed 
attention to SO(3), since rotations in three dimensions are physically sig- 
nificant. Still, it is remarkable that a purely geometric argument for the 
simplicity of SO(3) took so long to emerge. Perhaps its belated appearance
is due to its topological content, namely, the step that depends purely on
continuity. The argument hinges on the fact that $\theta$ is a continuous
function of distance along the great circle $PQ$, and that such a function
takes every value between its extreme values: the so-called intermediate
value theorem.

The theory of continuity (topology) came after the theory of continuous 
groups — not surprisingly, since one does not bother to develop a theory 
of continuity before seeing that it has some content — and applications of 
topology to group theory were rare before the 1920s. In this book we will 
present further isolated examples of continuity arguments in Sections 3.2, 
3.8, and 7.5 before taking up topology systematically in Chapter 8. 

Another book with a strongly geometric treatment of SO(3) is Berger
[1987]. Volume I of Berger, p. 169, has a simplicity proof for SO(3) similar
to the one given here, and it is extended to a simplicity result about SO(n),
for $n \ge 5$, on p. 170: SO(2m + 1) is simple and the only nontrivial normal
subgroup of SO(2m) is $\{\pm 1\}$. We arrive at the same result by a different
route in Section 7.5. (Our route is longer, but it also takes in the complex
and quaternion analogues of SO(n).) Berger treats SO(4) with the help of
quaternions on p. 190 of his Volume II, much as we have done here. The
quaternion representation of rotations of $\mathbb{R}^4$ was another of
Cayley’s discoveries, made in 1855.

Lie observed the anomalous structure of SO(4) at the infinitesimal 
level. He mentions it, in scarcely recognizable form, on p. 683 of Volume 
III of his 1893 book Theorie der Transfoimationsgruppen. The anomaly of 
SO(4) is hidden in some modern treatments of Lie theory, where the con- 
cept of simplicity is superseded by the more general concept of semisim- 
plicity. All simple groups are semisimple, and SO(4) is semisimple, so an 
anomaly is removed by relaxing the concept of “simple” to “semisimple.” 
However, the concept of semisimplicity makes little sense before one has 
absorbed the concept of simplicity, and our goal in this book is to under- 
stand the simple groups, notwithstanding the anomaly of SO(4). 

Preview

In this chapter we generalize the plane and space rotation groups SO(2)
and SO(3) to the special orthogonal group SO(n) of orientation-preserving
isometries of $\mathbb{R}^n$ that fix O. To deal uniformly with the concept of
“rotation” in all dimensions we make use of the standard inner product on
$\mathbb{R}^n$ and consider the linear transformations that preserve it.

Such transformations have determinant +1 or −1 according as they
preserve orientation or not, so SO(n) consists of those with determinant 1.
Those with determinant ±1 make up the full orthogonal group, O(n).

These ideas generalize further, to the space $\mathbb{C}^n$ with inner product
defined by

\[
(u_1, u_2, \ldots, u_n) \cdot (v_1, v_2, \ldots, v_n)
  = u_1\overline{v_1} + u_2\overline{v_2} + \cdots + u_n\overline{v_n}. \qquad (*)
\]

The group of linear transformations of $\mathbb{C}^n$ preserving (*) is called the
unitary group U(n), and the subgroup of transformations with determinant 1
is the special unitary group SU(n).

There is one more generalization of the concept of isometry, namely to the
space $\mathbb{H}^n$ of ordered n-tuples of quaternions. $\mathbb{H}^n$ has an inner product
defined like (*) (but with quaternion conjugates), and the group of linear
transformations preserving it is called the symplectic group Sp(n).

In the rest of the chapter we work out some easily accessible properties
of the generalized rotation groups: their maximal tori, centers, and their
path-connectedness. These properties later turn out to be crucial for the
problem of identifying simple Lie groups.

3.1 Rotations as orthogonal transformations

It follows from the Cartan-Dieudonné theorem of Section 2.4 that a rotation
about O in $\mathbb{R}^2$ or $\mathbb{R}^3$ is a linear transformation that preserves length
and orientation. We therefore adopt this description as the definition of a
rotation in $\mathbb{R}^n$. However, when the transformation is given by a matrix,
it is not easy to see directly whether it preserves length or orientation. A
more practical criterion emerges from consideration of the standard inner
product in $\mathbb{R}^n$, whose geometric properties we now summarize.

If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are two vectors in $\mathbb{R}^n$,
their inner product $\mathbf{u} \cdot \mathbf{v}$ is defined by

\[ \mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2 + \cdots + u_nv_n. \]

It follows immediately that

\[ \mathbf{u} \cdot \mathbf{u} = u_1^2 + u_2^2 + \cdots + u_n^2 = |\mathbf{u}|^2, \]

so the length $|\mathbf{u}|$ of $\mathbf{u}$ (that is, the distance of $\mathbf{u}$ from the origin $\mathbf{0}$) is
definable in terms of the inner product. It also follows (as one learns in linear
algebra courses) that $\mathbf{u} \cdot \mathbf{v} = 0$ if and only if $\mathbf{u}$ and $\mathbf{v}$ are orthogonal, and
more generally that

\[ \mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}|\cos\theta, \]

where $\theta$ is the angle between the lines from $\mathbf{0}$ to $\mathbf{u}$ and $\mathbf{0}$ to $\mathbf{v}$. Thus angle
is also definable in terms of inner product. Conversely, inner product is
definable in terms of length and angle. Moreover, an angle $\theta$ is determined
by $\cos\theta$ and $\sin\theta$, which are the ratios of lengths in a certain triangle, so
inner product is in fact definable in terms of length alone.

This means that a transformation T preserves length if and only if T
preserves the inner product, that is,

\[ T(\mathbf{u}) \cdot T(\mathbf{v}) = \mathbf{u} \cdot \mathbf{v} \quad\text{for all } \mathbf{u}, \mathbf{v} \in \mathbb{R}^n. \]

The inner product is a more convenient concept than length when one is
working with linear transformations, because linear transformations are
represented by matrices and the inner product occurs naturally within
matrix multiplication: if A and B are matrices for which AB exists then

\[ (i, j)\text{-element of } AB = (\text{row } i \text{ of } A) \cdot (\text{column } j \text{ of } B). \]
This observation is the key to the following concise and practical criterion
for recognizing rotations, involving the matrix A and its transpose $A^T$. To
state it we introduce the notation $\mathbf{1}$ for the identity matrix, of any size,
extending the notation used in Chapter 1 for the 2 × 2 identity matrix.

Rotation criterion. An n × n real matrix A represents a rotation of $\mathbb{R}^n$ if
and only if

\[ AA^T = \mathbf{1} \quad\text{and}\quad \det(A) = 1. \]

Proof. First we show that the condition $AA^T = \mathbf{1}$ is equivalent to
preservation of the inner product by A.

$AA^T = \mathbf{1}$
  ⇔ (row i of A) · (col j of $A^T$) = $\delta_{ij}$,
      where $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$
  ⇔ (row i of A) · (row j of A) = $\delta_{ij}$
  ⇔ rows of A form an orthonormal basis
  ⇔ columns of A form an orthonormal basis,
      because $AA^T = \mathbf{1}$ means $A^T = A^{-1}$, so $\mathbf{1} = A^TA = A^T(A^T)^T$,
      and hence $A^T$ has the same property as A
  ⇔ A-images of the standard basis form an orthonormal basis
  ⇔ A preserves the inner product,
      because $A\mathbf{e}_i \cdot A\mathbf{e}_j = \delta_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j$, where
      $\mathbf{e}_1 = (1, 0, \ldots, 0)^T$, $\mathbf{e}_2 = (0, 1, \ldots, 0)^T$, ..., $\mathbf{e}_n = (0, 0, \ldots, 1)^T$
      are the standard basis vectors of $\mathbb{R}^n$.

Second, the condition det(A) = 1 says that A preserves orientation, as
mentioned at the beginning of Section 2.5. Standard properties of
determinants give

\[ \det(AA^T) = \det(A)\det(A^T) \quad\text{and}\quad \det(A^T) = \det(A), \]

so we already have

\[ 1 = \det(\mathbf{1}) = \det(AA^T) = \det(A)\det(A^T) = \det(A)^2. \]

And the two solutions det(A) = 1 and det(A) = −1 occur according as A
preserves orientation or not. □

A rotation matrix is called a special orthogonal matrix, presumably 
because its rows (or columns) form an orthonormal basis. The matrices 
that preserve length, but not necessarily orientation, are called orthogonal.
(However, orthogonal matrices are not the only matrices that preserve
orthogonality. Orthogonality is also preserved by the dilation matrices $k\mathbf{1}$ for
any nonzero constant k.)
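
The rotation criterion is easy to test numerically. The following is a minimal sketch in plain Python, not part of the text's development: the helper names (`is_orthogonal`, `is_rotation`) and the tolerance are assumptions chosen for illustration. It checks $AA^T = \mathbf{1}$ and det(A) = 1 for a rotation, a reflection, and a dilation.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    # (i, j)-entry is (row i of A) . (column j of B)
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def det(A):
    # Laplace expansion along the first row; adequate for small matrices.
    if len(A) == 1:
        return A[0][0]
    total = 0.0
    for j, entry in enumerate(A[0]):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def is_identity(M, tol=1e-12):
    n = len(M)
    return all(abs(M[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def is_orthogonal(A, tol=1e-12):
    # A A^T = 1, i.e. the rows of A are orthonormal
    return is_identity(matmul(A, transpose(A)), tol)

def is_rotation(A, tol=1e-12):
    # the rotation criterion: A A^T = 1 and det(A) = 1
    return is_orthogonal(A, tol) and abs(det(A) - 1.0) <= tol

theta = 0.7
rot_z = [[math.cos(theta), -math.sin(theta), 0.0],
         [math.sin(theta), math.cos(theta), 0.0],
         [0.0, 0.0, 1.0]]
reflection = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
dilation = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]

print(is_rotation(rot_z), is_orthogonal(reflection),
      is_rotation(reflection), is_orthogonal(dilation))
```

The dilation fails the test even though it preserves orthogonality of vectors, which matches the parenthetical remark above.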

Exercises 

3.1.1 Give an example of a matrix in O(2) that is not in SO(2).

3.1.2 Give an example of a matrix in O(3) that is not in SO(3), and interpret it
geometrically.

3.1.3 Work out the matrix for the reflection of $\mathbb{R}^3$ in the plane through O
orthogonal to the unit vector (a, b, c).

3.2 The orthogonal and special orthogonal groups 

It follows from the definition of special orthogonal matrices that:

• If $A_1$ and $A_2$ are orthogonal, then $A_1A_1^T = \mathbf{1}$ and $A_2A_2^T = \mathbf{1}$. It follows
  that the product $A_1A_2$ satisfies

  \[
  \begin{aligned}
  (A_1A_2)(A_1A_2)^T &= A_1A_2A_2^TA_1^T && \text{because } (A_1A_2)^T = A_2^TA_1^T, \\
                     &= A_1A_1^T         && \text{because } A_2A_2^T = \mathbf{1}, \\
                     &= \mathbf{1}       && \text{because } A_1A_1^T = \mathbf{1}.
  \end{aligned}
  \]

• If $A_1$ and $A_2$ are special orthogonal, then $\det(A_1) = \det(A_2) = 1$, so

  \[ \det(A_1A_2) = \det(A_1)\det(A_2) = 1. \]

• If A is orthogonal, then $AA^T = \mathbf{1}$, hence $A^{-1} = A^T$. It follows that
  $(A^{-1})^T = (A^T)^T = A$, so $A^{-1}(A^{-1})^T = A^{-1}A = \mathbf{1}$ and $A^{-1}$ is also
  orthogonal. And $A^{-1}$ is special orthogonal if A is because

  \[ \det(A^{-1}) = \det(A)^{-1} = 1. \]

Thus products and inverses of n × n special orthogonal matrices are special
orthogonal, and hence they form a group. This group (the “rotation” group
of $\mathbb{R}^n$) is called the special orthogonal group SO(n).
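
The closure properties above can be checked on concrete matrices. This is an illustrative sketch only: the sample rotations `rot_x` and `rot_z`, the helper `in_SO3`, and the tolerance are assumptions made for the example, not notation from the text.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def det3(A):
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def close(A, B, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    # rotation of R^3 about the first coordinate axis
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(t):
    # rotation of R^3 about the third coordinate axis
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def in_SO3(A, tol=1e-12):
    return close(matmul(A, transpose(A)), IDENTITY, tol) and abs(det3(A) - 1.0) <= tol

A1, A2 = rot_x(0.4), rot_z(1.1)
product = matmul(A1, A2)   # should again be special orthogonal
inverse = transpose(A1)    # A^{-1} = A^T for an orthogonal matrix

print(in_SO3(product), in_SO3(inverse), close(matmul(A1, inverse), IDENTITY))
```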

If we drop the requirement that orientation be preserved, then we get
a larger group of transformations of $\mathbb{R}^n$ called the orthogonal group O(n).
An example of a transformation that is in O(n), but not in SO(n), is
reflection in the hyperplane orthogonal to the $x_1$-axis,
$(x_1, x_2, x_3, \ldots, x_n) \mapsto (-x_1, x_2, x_3, \ldots, x_n)$, which has the matrix

\[
\begin{pmatrix}
-1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix},
\]

obviously of determinant −1. We notice that the determinant of a matrix
A ∈ O(n) is ±1 because (as mentioned in the previous section)

\[ AA^T = \mathbf{1} \;\Rightarrow\; 1 = \det(AA^T) = \det(A)\det(A^T) = \det(A)^2. \]


Path-connectedness 

The most striking difference between SO(n) and O(n) is a topological one:
SO(n) is path-connected and O(n) is not. That is, if we view n × n matrices
as points of $\mathbb{R}^{n^2}$ in the natural way (by interpreting the $n^2$ matrix entries
$a_{11}, a_{12}, \ldots, a_{1n}, a_{21}, \ldots, a_{2n}, \ldots, a_{n1}, \ldots, a_{nn}$ as the coordinates of a point)
then any two points in SO(n) may be connected by a continuous path in
SO(n), but the same is not true of O(n). Indeed, there is no continuous
path in O(n) from

\[
\begin{pmatrix}
1 & & & \\
& 1 & & \\
& & \ddots & \\
& & & 1
\end{pmatrix}
\quad\text{to}\quad
\begin{pmatrix}
-1 & & & \\
& 1 & & \\
& & \ddots & \\
& & & 1
\end{pmatrix}
\]

(where the entries left blank are all zero) because the value of the
determinant cannot jump from 1 to −1 along a continuous path.

The path-connectedness of SO(n) is not quite obvious, but it is
interesting because it reconciles the everyday concept of “rotation” with the
mathematical concept. In mathematics, a rotation of $\mathbb{R}^n$ is given by
specifying just one configuration, usually the final position of the basis vectors,
in terms of their initial position. This position is expressed by a matrix
A. In everyday speech, a “rotation” is a movement through a continuous
sequence of positions, so it corresponds to a path in SO(n) connecting the
initial matrix $\mathbf{1}$ to the final matrix A.
Thus a final position A of $\mathbb{R}^n$ can be realized by a “rotation” in the
everyday sense of the word only if SO(n) is path-connected.

Path-connectedness of SO(n). For any n, SO(n) is path-connected.

Proof. For n = 2 we have the circle SO(2), which is obviously
path-connected (Figure 3.1). Now suppose that SO(n − 1) is path-connected
and that A ∈ SO(n). It suffices to find a path in SO(n) from $\mathbf{1}$ to A, because
if there are paths from $\mathbf{1}$ to A and B then there is a path from A to B.

[Figure 3.1: Path-connectedness of SO(2). The figure marks the points 1 and $\cos\theta + i\sin\theta$ on the unit circle.]

This amounts to finding a continuous motion taking the basis vectors
$\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ to their final positions $A\mathbf{e}_1, A\mathbf{e}_2, \ldots, A\mathbf{e}_n$ (the columns of A).

The vectors $\mathbf{e}_1$ and $A\mathbf{e}_1$ (if distinct) define a plane $\mathcal{P}$ so, by the
path-connectedness of SO(2), we can move $\mathbf{e}_1$ continuously to the position $A\mathbf{e}_1$
by a rotation R of $\mathcal{P}$. It then suffices to continuously move $R\mathbf{e}_2, \ldots, R\mathbf{e}_n$ to
$A\mathbf{e}_2, \ldots, A\mathbf{e}_n$, respectively, keeping $A\mathbf{e}_1$ fixed. Notice that

• $R\mathbf{e}_2, \ldots, R\mathbf{e}_n$ are all orthogonal to $R\mathbf{e}_1 = A\mathbf{e}_1$, because $\mathbf{e}_2, \ldots, \mathbf{e}_n$ are
  all orthogonal to $\mathbf{e}_1$ and R preserves angles.

• $A\mathbf{e}_2, \ldots, A\mathbf{e}_n$ are all orthogonal to $A\mathbf{e}_1$, because $\mathbf{e}_2, \ldots, \mathbf{e}_n$ are all
  orthogonal to $\mathbf{e}_1$ and A preserves angles.

Thus the required motion can take place in the $\mathbb{R}^{n-1}$ of vectors orthogonal
to $A\mathbf{e}_1$, where it exists by the assumption that SO(n − 1) is path-connected.

Performing the two motions in succession (taking $\mathbf{e}_1$ to $A\mathbf{e}_1$ and then
$R\mathbf{e}_2, \ldots, R\mathbf{e}_n$ to $A\mathbf{e}_2, \ldots, A\mathbf{e}_n$) gives a path from $\mathbf{1}$ to A in SO(n). □
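
For the base case n = 2, the path can be written down explicitly: the rotation through angle $t\theta$, for t from 0 to 1, runs from $\mathbf{1}$ to the rotation through $\theta$. The sketch below samples such a path and checks that every sample lies in SO(2). The function names and the sampling are assumptions for illustration, and a finite sample of course only suggests continuity rather than proving it.

```python
import math

def rotation(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def in_SO2(M, tol=1e-12):
    (a, b), (c, d) = M
    rows_orthonormal = (abs(a * a + b * b - 1) <= tol
                        and abs(c * c + d * d - 1) <= tol
                        and abs(a * c + b * d) <= tol)
    return rows_orthonormal and abs(a * d - b * c - 1) <= tol

def path_from_identity(theta, steps=50):
    # samples of the path t -> rotation(t * theta), 0 <= t <= 1
    return [rotation(theta * k / steps) for k in range(steps + 1)]

theta = 2.5
path = path_from_identity(theta)

print(all(in_SO2(M) for M in path))
```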

The idea of path-connectedness will be explored further in Sections 3.8 
and 8.6. In the meantime, the idea of continuous path is used informally in 
the exercises below to show that path-connectedness has interesting
algebraic implications.

Exercises 

The following exercises study the identity component in a matrix group G, that is,
the set of matrices A ∈ G for which there is a continuous path from $\mathbf{1}$ to A that lies
inside G.

3.2.1 Bearing in mind that matrix multiplication is a continuous operation, show
that if there are continuous paths in G from $\mathbf{1}$ to A ∈ G and to B ∈ G then
there is a continuous path in G from A to AB.

3.2.2 Similarly, show that if there is a continuous path in G from $\mathbf{1}$ to A, then
there is also a continuous path from $A^{-1}$ to $\mathbf{1}$.

3.2.3 Deduce from Exercises 3.2.1 and 3.2.2 that the identity component of G is
a subgroup of G.


3.3 The unitary groups 

The unitary groups U(n) and SU(n) are the analogues of the orthogonal
groups O(n) and SO(n) for the complex vector space $\mathbb{C}^n$, which consists
of the ordered n-tuples $(z_1, z_2, \ldots, z_n)$ of complex numbers. The sum
operation on $\mathbb{C}^n$ is the usual vector addition:

\[ (u_1, u_2, \ldots, u_n) + (v_1, v_2, \ldots, v_n) = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n). \]

And the multiple of $(z_1, z_2, \ldots, z_n) \in \mathbb{C}^n$ by a scalar $c \in \mathbb{C}$ is naturally
$(cz_1, cz_2, \ldots, cz_n)$. The twist comes with the inner product, because we
would like the inner product of a vector $\mathbf{v}$ with itself to be a real number,
namely the squared distance $|\mathbf{v}|^2$ from the zero vector $\mathbf{0}$ to $\mathbf{v}$. We ensure this by
the definition

\[
(u_1, u_2, \ldots, u_n) \cdot (v_1, v_2, \ldots, v_n)
  = u_1\overline{v_1} + u_2\overline{v_2} + \cdots + u_n\overline{v_n}. \qquad (*)
\]

With this definition of $\mathbf{u} \cdot \mathbf{v}$ we have

\[
\mathbf{v} \cdot \mathbf{v} = v_1\overline{v_1} + v_2\overline{v_2} + \cdots + v_n\overline{v_n}
  = |v_1|^2 + |v_2|^2 + \cdots + |v_n|^2 = |\mathbf{v}|^2,
\]

and $|\mathbf{v}|^2$ is indeed the squared distance of $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ from $\mathbf{0}$ in the
space $\mathbb{R}^{2n}$ that equals $\mathbb{C}^n$ when we interpret each copy of $\mathbb{C}$ as $\mathbb{R}^2$.


The kind of inner product defined by (*) is called Hermitian (after the
nineteenth-century French mathematician Charles Hermite). Just as one
meets ordinary inner products of rows when forming the product

\[ AA^T, \quad\text{for a real matrix } A, \]

so too one meets the Hermitian inner product (*) of rows when forming the
product

\[ A\overline{A}^T, \quad\text{for a complex matrix } A. \]

Here $\overline{A}$ denotes the result of replacing each entry $a_{ij}$ of A by its complex
conjugate $\overline{a_{ij}}$.

With this adjustment the arguments of Section 3.1 go through, and one
obtains the following theorem.

Criterion for preserving the inner product on $\mathbb{C}^n$. A linear transformation
of $\mathbb{C}^n$ preserves the inner product (*) if and only if its matrix A satisfies
$A\overline{A}^T = \mathbf{1}$, where $\mathbf{1}$ is the identity matrix. □

As in Section 3.1, one finds that the rows (or columns) of A form an
orthonormal basis of $\mathbb{C}^n$. The rows $\mathbf{v}_i$ are “normal” in the sense that $|\mathbf{v}_i| = 1$,
and “orthogonal” in the sense that $\mathbf{v}_i \cdot \mathbf{v}_j = 0$ when $i \neq j$, where the dot
denotes the inner product (*).

It is clear that if linear transformations preserve the inner product (*)
then their product and inverses also preserve (*), so the set of all
transformations preserving (*) is a group. This group is called the unitary group
U(n). The determinant of an A in U(n) has absolute value 1 because

\[
A\overline{A}^T = \mathbf{1} \;\Rightarrow\; 1 = \det(A\overline{A}^T) = \det(A)\det(\overline{A}^T)
  = \det(A)\overline{\det(A)} = |\det(A)|^2,
\]

and it is easy to see that det(A) can be any number with absolute value 1.

The subgroup of U(n) whose members have determinant 1 is called the
special unitary group SU(n).
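
As a small numerical illustration of the criterion, the sketch below builds one sample matrix in U(2) (a unit complex number times a matrix of the form used for unit quaternions in Section 1.3), then checks $A\overline{A}^T = \mathbf{1}$, $|\det(A)| = 1$, and preservation of the Hermitian inner product on two test vectors. The particular numbers and helper names are assumptions chosen for the example.

```python
import cmath
import math

def hermitian_inner(u, v):
    # u . v = u_1 conj(v_1) + ... + u_n conj(v_n), as in (*)
    return sum(a * b.conjugate() for a, b in zip(u, v))

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def conj_transpose(A):
    return [[entry.conjugate() for entry in col] for col in zip(*A)]

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_unitary(A, tol=1e-12):
    P = matmul(A, conj_transpose(A))
    n = len(A)
    return all(abs(P[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))

alpha = math.cos(0.3) * cmath.exp(0.5j)
beta = math.sin(0.3) * cmath.exp(1.2j)
phase = cmath.exp(0.9j)  # unit complex scalar, so det(A) = phase**2 need not be 1
A = [[phase * alpha, -phase * beta],
     [phase * beta.conjugate(), phase * alpha.conjugate()]]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]

u = [1 + 2j, -0.5j]
v = [0.25 - 1j, 3 + 0j]

print(is_unitary(A), abs(det_A))
```

Here A lies in U(2) but not in SU(2), since its determinant is `phase**2`, a number of absolute value 1 different from 1.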

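We have already met one SU(n), because the group of unit quaternions

\[
\begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix},
\quad\text{where } \alpha, \beta \in \mathbb{C} \text{ and } |\alpha|^2 + |\beta|^2 = 1,
\]

is none other than SU(2). The rows $(\alpha, -\beta)$ and $(\overline{\beta}, \overline{\alpha})$ are easily seen
to form an orthonormal basis of $\mathbb{C}^2$. Conversely, $(\alpha, -\beta)$ is an arbitrary
unit vector in $\mathbb{C}^2$, and $(\overline{\beta}, \overline{\alpha})$ is the unique unit vector orthogonal to it that
makes the determinant equal to 1.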

Path-connectedness of SU(n)

We can prove that SU(n) is path-connected, along similar lines to the proof
for SO(n) in the previous section. The proof is again by induction on n,
but the case n = 2 now demands a little more thought. It is helpful to use
the complex exponential function $e^{ix}$, which we take to equal $\cos x + i\sin x$
by definition for now. (In Chapter 4 we study exponentiation in depth.)

Given $\begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}$ in SU(2), first note that $(\alpha, \beta)$ is a unit vector in $\mathbb{C}^2$,
so $\alpha = u\cos\theta$ and $\beta = v\sin\theta$ for some u, v in $\mathbb{C}$ with $|u| = |v| = 1$. This
means that $u = e^{i\varphi}$ and $v = e^{i\psi}$ for some $\varphi, \psi \in \mathbb{R}$.

It follows that

\[ \alpha(t) = e^{i\varphi t}\cos\theta t, \quad \beta(t) = e^{i\psi t}\sin\theta t, \quad\text{for } 0 \le t \le 1, \]

gives a continuous path from $\mathbf{1}$ to $\begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}$ in SU(2). Thus
SU(2) is path-connected.
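
The path $\alpha(t), \beta(t)$ can be sampled numerically as a sanity check. In the sketch below the values of $\varphi$, $\psi$, $\theta$ are arbitrary sample choices and the function names are hypothetical; the code checks that the path starts at $\mathbf{1}$, ends at the target matrix, and stays in SU(2) (determinant $|\alpha(t)|^2 + |\beta(t)|^2 = 1$) at each sample.

```python
import cmath
import math

def su2(alpha, beta):
    # matrix with rows (alpha, -beta) and (conj(beta), conj(alpha))
    return [[alpha, -beta], [beta.conjugate(), alpha.conjugate()]]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def path_point(phi, psi, theta, t):
    # alpha(t) = e^{i phi t} cos(theta t), beta(t) = e^{i psi t} sin(theta t)
    alpha_t = cmath.exp(1j * phi * t) * math.cos(theta * t)
    beta_t = cmath.exp(1j * psi * t) * math.sin(theta * t)
    return su2(alpha_t, beta_t)

def close(A, B, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

phi, psi, theta = 0.8, -1.3, 0.6
target = su2(cmath.exp(1j * phi) * math.cos(theta),
             cmath.exp(1j * psi) * math.sin(theta))
samples = [path_point(phi, psi, theta, k / 100) for k in range(101)]

print(close(samples[0], [[1, 0], [0, 1]]), close(samples[-1], target))
```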

Exercises 

Actually, SU(2) is not the only special unitary group we have already met, though
the other one is less interesting.

3.3.1 What is SU(1)?

The following exercises verify that a linear transformation of $\mathbb{C}^n$, with matrix
A, preserves the Hermitian inner product (*) if and only if $A\overline{A}^T = \mathbf{1}$. They can be
proved by imitating the corresponding steps of the proof in Section 3.1.

3.3.2 Show that vectors form an orthonormal basis of $\mathbb{C}^n$ if and only if their
conjugates form an orthonormal basis, where the conjugate of a vector
$(u_1, u_2, \ldots, u_n)$ is the vector $(\overline{u_1}, \overline{u_2}, \ldots, \overline{u_n})$.

3.3.3 Show that $A\overline{A}^T = \mathbf{1}$ if and only if the row vectors of A form an orthonormal
basis of $\mathbb{C}^n$.

3.3.4 Deduce from Exercises 3.3.2 and 3.3.3 that the column vectors of A form 
an orthonormal basis. 

3.3.5 Show that if A preserves the inner product (*) then the columns of A form 
an orthonormal basis. 

3.3.6 Show, conversely, that if the columns of A form an orthonormal basis, then 
A preserves the inner product (*). 


3.4 The symplectic groups 

On the space $\mathbb{H}^n$ of ordered n-tuples of quaternions there is a natural inner
product,

\[
(p_1, p_2, \ldots, p_n) \cdot (q_1, q_2, \ldots, q_n)
  = p_1\overline{q_1} + p_2\overline{q_2} + \cdots + p_n\overline{q_n}. \qquad (**)
\]

This of course is formally the same as the inner product (*) on $\mathbb{C}^n$,
except that the $p_i$ and $q_j$ now denote arbitrary quaternions. The space $\mathbb{H}^n$
is not a vector space over $\mathbb{H}$, because the quaternions do not act correctly
as “scalars”: multiplying a vector on the left by a quaternion is in general
different from multiplying it on the right, because of the noncommutative
nature of the quaternion product.

Nevertheless, quaternion matrices make sense (thanks to the
associativity of the quaternion product, we still get an associative matrix product),
and we can use them to define linear transformations of $\mathbb{H}^n$. Then, by
specializing to the transformations that preserve the inner product (**), we get
an analogue of the orthogonal group for $\mathbb{H}^n$ called the symplectic group
Sp(n). As with the unitary groups, preserving the inner product implies
preserving length in the corresponding real space, in this case in the space
$\mathbb{R}^{4n}$ corresponding to $\mathbb{H}^n$.

For example, Sp(1) consists of the 1 × 1 quaternion matrices,
multiplication by which preserves length in $\mathbb{H} = \mathbb{R}^4$. In other words, the members
of Sp(1) are simply the unit quaternions. Because we defined quaternions
in Section 1.3 as the 2 × 2 complex matrices

\[ \begin{pmatrix} a + id & -b - ic \\ b - ic & a - id \end{pmatrix}, \]

it follows that

\[
\mathrm{Sp}(1) = \left\{ \begin{pmatrix} a + id & -b - ic \\ b - ic & a - id \end{pmatrix}
  : a^2 + b^2 + c^2 + d^2 = 1 \right\} = \mathrm{SU}(2).
\]

Thus we have already met the first symplectic group.

The quaternion matrices A in Sp(n), like the complex matrices in
SU(n), are characterized by the condition $A\overline{A}^T = \mathbf{1}$, where the bar now
denotes the quaternion conjugate. The proof is the same as for SU(n).
Because of this formal similarity, there is a proof that Sp(n) is
path-connected, similar to that for SU(n) given in the previous section.
However, we avoid imposing the condition det(A) = 1, because there
are difficulties in the very definition of determinant for quaternion matrices.
We sidestep this problem by interpreting all n × n quaternion matrices as
2n × 2n complex matrices.


The complex form of Sp(n)

In Section 1.3 we defined quaternions as the complex 2 × 2 matrices

\[
q = \begin{pmatrix} a + id & -b - ic \\ b - ic & a - id \end{pmatrix}
  = \begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}
\quad\text{for } \alpha, \beta \in \mathbb{C}.
\]

Thus the entries of a quaternion matrix are themselves 2 × 2 matrices q.
Thanks to a nice feature of the matrix product (that it admits block
multiplication) we can omit the parentheses of each matrix q. Then it is natural
to define the complex form, C(A), of a quaternion matrix A to be the result
of replacing each quaternion entry q in A by the 2 × 2 block

\[ \begin{matrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{matrix} \]

Notice also that the transposed complex conjugate of this block
corresponds to the quaternion conjugate of q:

\[
\overline{q} = \begin{pmatrix} a - id & b + ic \\ -b + ic & a + id \end{pmatrix}
  = \begin{pmatrix} \overline{\alpha} & \beta \\ -\overline{\beta} & \alpha \end{pmatrix}.
\]

Therefore, if A is a quaternion matrix such that $A\overline{A}^T = \mathbf{1}$, it follows by
block multiplication (and writing $\mathbf{1}$ for any identity matrix) that

\[ C(A)\overline{C(A)}^T = C(A\overline{A}^T) = C(\mathbf{1}) = \mathbf{1}. \]

Thus C(A) is a unitary matrix.

Conversely, if A is a quaternion matrix for which C(A) is unitary, then
$A\overline{A}^T = \mathbf{1}$. This follows by viewing the product $A\overline{A}^T$ of quaternion matrices
as the product $C(A)\overline{C(A)}^T$ of complex matrices. Therefore, the group Sp(n)
consists of those n × n quaternion matrices A for which C(A) is unitary.

It follows, if we define the complex form of Sp(n) to be the group of
matrices C(A) for A ∈ Sp(n), that the complex form of Sp(n) consists of the
unitary matrices of the form C(A), where A is an n × n quaternion matrix.
In particular, the complex form of Sp(n) is a subgroup of U(2n).
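
The simplest case, a 1 × 1 quaternion matrix, can be checked by direct computation. The sketch below represents a quaternion a + bi + cj + dk as a 4-tuple (a representation chosen here for illustration, as are the function names) and verifies three things on sample values: that C respects products, that the conjugate transpose of C(q) is C of the quaternion conjugate, and that C(q) is unitary when q is a unit quaternion.

```python
def C(q):
    # complex form of the quaternion a + bi + cj + dk, as in Section 1.3
    a, b, c, d = q
    return [[complex(a, d), complex(-b, -c)],
            [complex(b, -c), complex(a, -d)]]

def qmul(p, q):
    # Hamilton product, with ij = k, jk = i, ki = j
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def conj_transpose(A):
    return [[entry.conjugate() for entry in col] for col in zip(*A)]

def close(A, B, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

p = (1.0, 2.0, 3.0, 4.0)
q = (0.5, -1.0, 2.0, 0.25)
unit = (0.5, 0.5, 0.5, 0.5)  # a^2 + b^2 + c^2 + d^2 = 1

print(close(C(qmul(p, q)), matmul(C(p), C(q))),
      close(conj_transpose(C(q)), C(qconj(q))),
      close(matmul(C(unit), conj_transpose(C(unit))), [[1, 0], [0, 1]]))
```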


Many books on Lie theory avoid the use of quaternions, and define
Sp(n) as the group of unitary matrices of the form C(A). This gets around
the inconvenience that $\mathbb{H}^n$ is not quite a vector space over $\mathbb{H}$ (mentioned
above) but it breaks the simple thread joining the orthogonal, unitary, and
symplectic groups: they are the “generalized rotation” groups of the spaces
with coordinates from $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{H}$, respectively.


Exercises 

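It is easy to test whether a matrix consists of blocks of the form

\[ \begin{matrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{matrix} \]

Nevertheless, it is sometimes convenient to describe the property of “being of the
form C(A)” more algebraically. One way to do this is with the help of the special
matrix

\[ J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]

3.4.1 If $B = \begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}$, show that $J\overline{B}J^{-1} = B$.

3.4.2 Conversely, show that if $J\overline{B}J^{-1} = B$ and $B = \begin{pmatrix} c & d \\ e & f \end{pmatrix}$ then we have $c = \overline{f}$
and $d = -\overline{e}$, so B has the form $\begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}$.

Now suppose that $B_{2n}$ is any 2n × 2n complex matrix, and let

\[
J_{2n} = \begin{pmatrix}
J & 0 & \cdots & 0 \\
0 & J & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & J
\end{pmatrix},
\quad\text{where } 0 \text{ is the } 2 \times 2 \text{ zero matrix.}
\]

3.4.3 Use block multiplication, and the results of Exercises 3.4.1 and 3.4.2, to
show that $B_{2n}$ has the form C(A) if and only if $J_{2n}\overline{B_{2n}}J_{2n}^{-1} = B_{2n}$.

The equation satisfied by $J_{2n}$ and $B_{2n}$ enables us to derive information about $\det(B_{2n})$
(thus getting around the problem with the determinant of a quaternion matrix).

3.4.4 By taking det of both sides of the equation in Exercise 3.4.3, show that
$\det(B_{2n})$ is real.

3.4.5 Assuming now that $B_{2n}$ is in the complex form of Sp(n), and hence is
unitary, show that $\det(B_{2n}) = \pm 1$.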
One can prove Sp(n) is path-connected by an argument like that used for
SU(n) in the previous section. First prove path-connectedness of Sp(2) as for
SU(2), using a result from Section 4.2 that each unit quaternion is the exponential
of a pure imaginary quaternion.

3.4.6 Deduce from the path-connectedness of Sp(n) that $\det(B_{2n}) = 1$.

This is why there is no “special symplectic group”: the matrices in the symplectic
group already have determinant 1, under a sensible interpretation of determinant.

3.5 Maximal tori and centers 

The main key to understanding the structure of a Lie group G is its maximal
torus, a (not generally unique) maximal subgroup isomorphic to

\[ \mathbb{T}^k = \mathbb{S}^1 \times \mathbb{S}^1 \times \cdots \times \mathbb{S}^1 \quad (k\text{-fold Cartesian product}) \]

contained in G. The group $\mathbb{T}^k$ is called a torus because it generalizes the
ordinary torus $\mathbb{T}^2 = \mathbb{S}^1 \times \mathbb{S}^1$. An obvious example is the group SO(2) = $\mathbb{S}^1$,
which is its own maximal torus. For the other groups SO(n), not to mention
SU(n) and Sp(n), maximal tori are not so obvious, though we will find
them by elementary means in the next section. To illustrate the kind of
argument involved we first look at the case of SO(3).

Maximal torus of SO(3)

If we view SO(3) as the rotation group of $\mathbb{R}^3$, and let $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ be the
standard basis vectors, then the matrices

\[
R'_\theta = \begin{pmatrix}
\cos\theta & -\sin\theta & 0 \\
\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{pmatrix}
\]

form an obvious $\mathbb{T}^1 = \mathbb{S}^1$ in SO(3). The matrices $R'_\theta$ are simply rotations
of the $(\mathbf{e}_1, \mathbf{e}_2)$-plane through angle $\theta$, which leave the $\mathbf{e}_3$-axis fixed.

If $\mathbb{T}$ is any torus in G that contains this $\mathbb{T}^1$ then, since any torus is
abelian, any A ∈ $\mathbb{T}$ commutes with all $R'_\theta \in \mathbb{T}^1$. We will show that if

\[ AR'_\theta = R'_\theta A \quad\text{for all } R'_\theta \in \mathbb{T}^1 \qquad (*) \]

then A ∈ $\mathbb{T}^1$, so $\mathbb{T} = \mathbb{T}^1$ and hence $\mathbb{T}^1$ is maximal. It suffices to show that

\[ A(\mathbf{e}_1), A(\mathbf{e}_2) \in (\mathbf{e}_1, \mathbf{e}_2)\text{-plane}, \]

because in that case A is an isometry of the $(\mathbf{e}_1, \mathbf{e}_2)$-plane that fixes O. The
only such isometries are rotations and reflections, and the only ones that
commute with all rotations are rotations themselves.

So, suppose that

\[ A(\mathbf{e}_1) = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3. \]

By the hypothesis (*), A commutes with all $R'_\theta$, and in particular with

\[
R'_\pi = \begin{pmatrix}
-1 & 0 & 0 \\
0 & -1 & 0 \\
0 & 0 & 1
\end{pmatrix}.
\]

Now we have

\[
\begin{aligned}
AR'_\pi(\mathbf{e}_1) &= A(-\mathbf{e}_1) = -a_1\mathbf{e}_1 - a_2\mathbf{e}_2 - a_3\mathbf{e}_3, \\
R'_\pi A(\mathbf{e}_1) &= R'_\pi(a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3) = -a_1\mathbf{e}_1 - a_2\mathbf{e}_2 + a_3\mathbf{e}_3,
\end{aligned}
\]

so it follows from $AR'_\pi = R'_\pi A$ that $a_3 = 0$ and hence

\[ A(\mathbf{e}_1) \in (\mathbf{e}_1, \mathbf{e}_2)\text{-plane}. \]

A similar argument shows that

\[ A(\mathbf{e}_2) \in (\mathbf{e}_1, \mathbf{e}_2)\text{-plane}, \]

which completes the proof that $\mathbb{T}^1$ is maximal in SO(3). □
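
The role of the half-turn $R'_\pi$ in this argument can be seen on a concrete example. The sketch below (function names, angles, and tolerance are assumptions for illustration) takes a rotation A about the $\mathbf{e}_2$-axis, for which the coefficient $a_3$ of $A(\mathbf{e}_1)$ is nonzero, and confirms that A fails to commute with $R'_\pi$, whereas the members of the torus commute with each other.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def close(A, B, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def torus_element(theta):
    # R'_theta: rotation of the (e1, e2)-plane, fixing the e3-axis
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(t):
    # rotation about the e2-axis, which moves e1 out of the (e1, e2)-plane
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

R_pi = torus_element(math.pi)
A = rot_y(0.5)
a3 = A[2][0]  # third coordinate of A(e1), the first column of A

B = torus_element(0.3)

print(a3, close(matmul(A, R_pi), matmul(R_pi, A)),
      close(matmul(B, R_pi), matmul(R_pi, B)))
```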

An important substructure of G revealed by the maximal torus is the
center of G, a subgroup defined by

\[ Z(G) = \{A \in G : AB = BA \text{ for all } B \in G\}. \]

(The letter Z stands for “Zentrum,” the German word for “center.”) It is
easy to check that Z(G) is closed under products and inverses, and hence
Z(G) is a group. We can illustrate how the maximal torus reveals the center
with the example of SO(3) again.

Center of SO(3)

An element A ∈ Z(SO(3)) commutes with all elements of SO(3), and in
particular with all elements of the maximal torus $\mathbb{T}^1$. The argument above
then shows that A fixes the basis vector $\mathbf{e}_3$. Interchanging basis vectors, we
likewise find that A fixes $\mathbf{e}_1$ and $\mathbf{e}_2$. Hence A is the identity rotation $\mathbf{1}$.

Thus Z(SO(3)) = {$\mathbf{1}$}. □





Exercises 

The 2-to-1 map from SU(2) to SO(3) ensures that the maximal torus and center
of SU(2) are similar to those of SO(3).

3.5.1 Give an example of a $\mathbb{T}^1$ in SU(2).

3.5.2 Explain why a $\mathbb{T}^2$ in SU(2) yields a $\mathbb{T}^2$ in SO(3), so $\mathbb{T}^1$ is maximal in
SU(2). (Hint: Map each element g of the $\mathbb{T}^2$ in SU(2) to the pair ±g in
SO(3), and look at the images of the $\mathbb{S}^1$ factors of $\mathbb{T}^2$.)

3.5.3 Explain why Z(SU(2)) = {±$\mathbf{1}$}.

The center of SO(3) can also be found by a direct geometric argument.

3.5.4 Suppose that A is a rotation of $\mathbb{R}^3$, about the $\mathbf{e}_1$-axis, say, that is not the
identity and not a half-turn. Explain (preferably with pictures) why A does
not commute with the half-turn about the $\mathbf{e}_3$-axis.

3.5.5 If A is a half-turn of $\mathbb{R}^3$ about the $\mathbf{e}_1$-axis, find a rotation that does not
commute with A.

In Section 3.7 we will show that Z(SO(2m + 1)) = {$\mathbf{1}$} for all m. However,
the situation is different for SO(2m).

3.5.6 Give an example of a nonidentity element of Z(SO(2m)) for each m ≥ 2.


3.6 Maximal tori in SO(n), U(n), SU(n), Sp(n)

The one-dimensional torus $\mathbb{T}^1 = \mathbb{S}^1$ appears as a matrix group in several
different guises:

• as a group of 2 × 2 real matrices

  \[ R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \]

• as a group of complex numbers (or 1 × 1 complex matrices)

  \[ z_\theta = \cos\theta + i\sin\theta, \]

• as a group of quaternions (or 1 × 1 quaternion matrices)

  \[ q_\theta = \cos\theta + \mathbf{i}\sin\theta. \]

Each of these incarnations of $\mathbb{T}^1$ gives rise to a different incarnation of $\mathbb{T}^k$:


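• as a group of 2k × 2k real matrices

  \[
  R_{\theta_1,\theta_2,\ldots,\theta_k} =
  \begin{pmatrix}
  \cos\theta_1 & -\sin\theta_1 & & & & & \\
  \sin\theta_1 & \cos\theta_1 & & & & & \\
  & & \cos\theta_2 & -\sin\theta_2 & & & \\
  & & \sin\theta_2 & \cos\theta_2 & & & \\
  & & & & \ddots & & \\
  & & & & & \cos\theta_k & -\sin\theta_k \\
  & & & & & \sin\theta_k & \cos\theta_k
  \end{pmatrix},
  \]

  where all the blank entries are zero,

• as a group of k × k unitary matrices

  \[
  Z_{\theta_1,\theta_2,\ldots,\theta_k} =
  \begin{pmatrix}
  e^{i\theta_1} & & & \\
  & e^{i\theta_2} & & \\
  & & \ddots & \\
  & & & e^{i\theta_k}
  \end{pmatrix},
  \]

  where all the blank entries are zero and $e^{i\theta} = \cos\theta + i\sin\theta$,

• as a group of k × k symplectic matrices

  \[
  Q_{\theta_1,\theta_2,\ldots,\theta_k} =
  \begin{pmatrix}
  e^{\mathbf{i}\theta_1} & & & \\
  & e^{\mathbf{i}\theta_2} & & \\
  & & \ddots & \\
  & & & e^{\mathbf{i}\theta_k}
  \end{pmatrix},
  \]

  where all the blank entries are zero and $e^{\mathbf{i}\theta} = \cos\theta + \mathbf{i}\sin\theta$. (This
  generalization of the exponential function is justified in the next
  chapter. In the meantime, $e^{\mathbf{i}\theta}$ may be taken as an abbreviation for
  $\cos\theta + \mathbf{i}\sin\theta$.)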


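We can also represent $\mathbb{T}^k$ by larger matrices obtained by “padding” the
above matrices with an extra row and column, both consisting of zeros
except for a 1 at the bottom right-hand corner (as we did to produce the
matrices $R'_\theta$ in SO(3) in the previous section). Using this idea, we find the
following tori in the groups SO(2m), SO(2m + 1), U(n), SU(n), and Sp(n).

In SO(2m) we have the $\mathbb{T}^m$ consisting of the matrices $R_{\theta_1,\theta_2,\ldots,\theta_m}$. In
SO(2m + 1) we have the $\mathbb{T}^m$ consisting of the “padded” matrices

\[
R'_{\theta_1,\theta_2,\ldots,\theta_m} =
\begin{pmatrix}
\cos\theta_1 & -\sin\theta_1 & & & & & & \\
\sin\theta_1 & \cos\theta_1 & & & & & & \\
& & \cos\theta_2 & -\sin\theta_2 & & & & \\
& & \sin\theta_2 & \cos\theta_2 & & & & \\
& & & & \ddots & & & \\
& & & & & \cos\theta_m & -\sin\theta_m & \\
& & & & & \sin\theta_m & \cos\theta_m & \\
& & & & & & & 1
\end{pmatrix}.
\]

In U(n) we have the $\mathbb{T}^n$ consisting of the matrices $Z_{\theta_1,\theta_2,\ldots,\theta_n}$. In SU(n) we
have the $\mathbb{T}^{n-1}$ consisting of the $Z_{\theta_1,\theta_2,\ldots,\theta_n}$ with $\theta_1 + \theta_2 + \cdots + \theta_n = 0$. The
latter matrices form a $\mathbb{T}^{n-1}$ because

\[
\begin{pmatrix}
e^{i\theta_1} & & & \\
& \ddots & & \\
& & e^{i\theta_{n-1}} & \\
& & & e^{i\theta_n}
\end{pmatrix}
= e^{i\theta_n}
\begin{pmatrix}
e^{i(\theta_1 - \theta_n)} & & & \\
& \ddots & & \\
& & e^{i(\theta_{n-1} - \theta_n)} & \\
& & & 1
\end{pmatrix}
\]

and the matrices on the right clearly form a $\mathbb{T}^{n-1}$. Finally, in Sp(n) we
have the $\mathbb{T}^n$ consisting of the matrices $Q_{\theta_1,\theta_2,\ldots,\theta_n}$.

We now show that these “obvious” tori are maximal. As with SO(3),
used as an illustration in the previous section, the proof in each case
considers a matrix A ∈ G that commutes with each member of the given torus
$\mathbb{T}$, and shows that A ∈ $\mathbb{T}$.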


Maximal tori in generalized rotation groups. The tori listed above are
maximal in the corresponding groups.

Proof. Case (1): $\mathbb{T}^m$ in SO(2m), for m ≥ 2.

If we let $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_{2m}$ denote the standard basis vectors for $\mathbb{R}^{2m}$, then
the typical member $R_{\theta_1,\theta_2,\ldots,\theta_m}$ of $\mathbb{T}^m$ is the product of the following plane
rotations, each of which fixes the basis vectors orthogonal to the plane:

    rotation of the $(\mathbf{e}_1, \mathbf{e}_2)$-plane through angle $\theta_1$,
    rotation of the $(\mathbf{e}_3, \mathbf{e}_4)$-plane through angle $\theta_2$,
    ...
    rotation of the $(\mathbf{e}_{2m-1}, \mathbf{e}_{2m})$-plane through angle $\theta_m$.

Now suppose that A ∈ SO(2m) commutes with each $R_{\theta_1,\theta_2,\ldots,\theta_m}$. We are
going to show that

    $A(\mathbf{e}_1), A(\mathbf{e}_2) \in (\mathbf{e}_1, \mathbf{e}_2)$-plane,
    $A(\mathbf{e}_3), A(\mathbf{e}_4) \in (\mathbf{e}_3, \mathbf{e}_4)$-plane,
    ...
    $A(\mathbf{e}_{2m-1}), A(\mathbf{e}_{2m}) \in (\mathbf{e}_{2m-1}, \mathbf{e}_{2m})$-plane,

from which it follows that A is a product of rotations of these planes, and
hence is a member of $\mathbb{T}^m$. (The possibility that A reflects some plane $\mathcal{P}$ is
ruled out by the fact that A commutes with all members of $\mathbb{T}^m$, including
those that rotate only the plane $\mathcal{P}$. Then it follows as in the case of SO(3)
that A rotates $\mathcal{P}$.)

To show that A maps the basis vectors into the planes claimed, it
suffices to show that $A(\mathbf{e}_1) \in (\mathbf{e}_1, \mathbf{e}_2)$-plane, since the other cases are similar.
So, suppose that

\[ AR_{\theta_1,\theta_2,\ldots,\theta_m} = R_{\theta_1,\theta_2,\ldots,\theta_m}A \quad\text{for all } R_{\theta_1,\theta_2,\ldots,\theta_m} \in \mathbb{T}^m, \]

and in particular that

\[ AR_{\pi,0,\ldots,0}(\mathbf{e}_1) = R_{\pi,0,\ldots,0}A(\mathbf{e}_1). \]

Then if $A(\mathbf{e}_1) = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + \cdots + a_{2m}\mathbf{e}_{2m}$, we have

\[ AR_{\pi,0,\ldots,0}(\mathbf{e}_1) = A(-\mathbf{e}_1) = -a_1\mathbf{e}_1 - a_2\mathbf{e}_2 - a_3\mathbf{e}_3 - \cdots - a_{2m}\mathbf{e}_{2m}, \]

but

\[ R_{\pi,0,\ldots,0}A(\mathbf{e}_1) = -a_1\mathbf{e}_1 - a_2\mathbf{e}_2 + a_3\mathbf{e}_3 + \cdots + a_{2m}\mathbf{e}_{2m}, \]

whence $a_3 = a_4 = \cdots = a_{2m} = 0$, as required.

The argument is similar for any other $\mathbf{e}_k$. Hence A ∈ $\mathbb{T}^m$, as claimed.
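Case (2): $\mathbb{T}^m$ in SO(2m + 1).

In this case we generalize the argument for SO(3) from the previous
section, using maps such as $R'_{\pi,0,\ldots,0}$ in place of $R'_\pi$.

Case (3): $\mathbb{T}^n$ in U(n).

Let $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ be the standard basis vectors of $\mathbb{C}^n$, and suppose that
A commutes with each element $Z_{\theta_1,\theta_2,\ldots,\theta_n}$ of $\mathbb{T}^n$. In particular, A commutes
with

\[
Z_{\pi,0,\ldots,0} =
\begin{pmatrix}
-1 & & & \\
& 1 & & \\
& & \ddots & \\
& & & 1
\end{pmatrix}.
\]

Then if $A(\mathbf{e}_1) = a_1\mathbf{e}_1 + \cdots + a_n\mathbf{e}_n$ we have

\[
\begin{aligned}
AZ_{\pi,0,\ldots,0}(\mathbf{e}_1) &= A(-\mathbf{e}_1) = -a_1\mathbf{e}_1 - \cdots - a_n\mathbf{e}_n, \\
Z_{\pi,0,\ldots,0}A(\mathbf{e}_1) &= Z_{\pi,0,\ldots,0}(a_1\mathbf{e}_1 + \cdots + a_n\mathbf{e}_n) = -a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + \cdots + a_n\mathbf{e}_n,
\end{aligned}
\]

whence it follows that $a_2 = \cdots = a_n = 0$.

Thus $A(\mathbf{e}_1) = c_1\mathbf{e}_1$ for some $c_1 \in \mathbb{C}$, and a similar argument shows that
$A(\mathbf{e}_k) = c_k\mathbf{e}_k$ for each k. Also, $A(\mathbf{e}_1), \ldots, A(\mathbf{e}_n)$ are an orthonormal basis,
since A ∈ U(n). Hence each $|c_k| = 1$, so $c_k = e^{i\varphi_k}$ and therefore A ∈ $\mathbb{T}^n$.

Case (4): $\mathbb{T}^{n-1}$ in SU(n).

For n > 2 we can argue as for U(n), except that we need to commute
A with both $Z_{\pi,\pi,0,\ldots,0}$ and $Z_{\pi,0,\pi,0,\ldots,0}$ to conclude that $A(\mathbf{e}_1) = c_1\mathbf{e}_1$. This is
because $Z_{\pi,0,\ldots,0}$ is not in SU(n), since it has determinant −1.

For n = 2 we can argue as follows.

Suppose $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ commutes with each $Z_{\theta,-\theta} \in \mathbb{T}^1$. In particular, A
commutes with

\[ Z_{\pi/2,-\pi/2} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \]

which implies that

\[ \begin{pmatrix} ai & -bi \\ ci & -di \end{pmatrix} = \begin{pmatrix} ai & bi \\ -ci & -di \end{pmatrix}. \]

It follows that b = c = 0 and hence A ∈ $\mathbb{T}^1$.

Case (5): $\mathbb{T}^n$ in Sp(n).

Here we can argue exactly as in Case (3). □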


Exercises 

3.6.1 Viewing $\mathbb{C}^n$ as $\mathbb{R}^{2n}$, show that
$Z_{\theta_1,\theta_2,\ldots,\theta_n}$ is the same isometry as
$R_{\theta_1,\theta_2,\ldots,\theta_n}$.

3.6.2 Use Exercise 3.6.1 to give another proof that $\mathbb{T}^n$ is a maximal
torus of U(n).

3.6.3 Show that the maximal tori found above are in fact maximal abelian
subgroups of SO(n), U(n), SU(n), Sp(n).

We did not look for a maximal torus in O(n) because the subgroup SO(n) is of
more interest to us, but in any case it is easy to find a maximal torus in O(n).

3.6.4 Explain why a maximal torus of O(n) is also a maximal torus of SO(n).

3.7 Centers of SO(n), U(n), SU(n), Sp(n)


The arguments in the previous section show that an element $A$ in
$G$ = SO(n), U(n), SU(n), Sp(n) that commutes with all elements of a maximal
torus $T$ in $G$ is in fact in $T$. It follows that if $A$ commutes with all
elements of $G$ then $A \in T$. Thus we can assume that elements $A$ of the
center $Z(G)$ of $G$ have the special form known for members of $T$. This
enables us to identify $Z(G)$ fairly easily when $G$ = SO(n), U(n), SU(n), Sp(n).

Centers of generalized rotation groups. The centers of these groups are:

(1) $Z(\mathrm{SO}(2m)) = \{\pm\mathbf{1}\}$.

(2) $Z(\mathrm{SO}(2m+1)) = \{\mathbf{1}\}$.

(3) $Z(\mathrm{U}(n)) = \{\omega\mathbf{1} : |\omega| = 1\}$.

(4) $Z(\mathrm{SU}(n)) = \{\omega\mathbf{1} : \omega^n = 1\}$.

(5) $Z(\mathrm{Sp}(n)) = \{\pm\mathbf{1}\}$.

Proof. Case (1): $A \in Z(\mathrm{SO}(2m))$ for $m \ge 2$.

In this case $A = R_{\theta_1,\theta_2,\ldots,\theta_m}$ for some angles
$\theta_1, \theta_2, \ldots, \theta_m$, and $A$ commutes with all members of
SO(2m). Now $R_{\theta_1,\theta_2,\ldots,\theta_m}$ is built from a sequence of
$2 \times 2$ blocks (placed along the diagonal) of the form

\[ R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]

We notice that $R_\theta$ does not commute with the matrix

\[ I^* = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]

unless $\sin\theta = 0$ and hence $\cos\theta = \pm 1$. Therefore, if we build a
matrix $I^*_{2m} \in \mathrm{SO}(2m)$ with copies of $I^*$ on the diagonal,
$R_{\theta_1,\theta_2,\ldots,\theta_m}$ will commute with $I^*_{2m}$ only if each
$\sin\theta_k = 0$ and $\cos\theta_k = \pm 1$.

Thus a matrix $A$ in $Z(\mathrm{SO}(2m))$ has diagonal entries $\pm 1$ and zeros
elsewhere. Moreover, if both $+1$ and $-1$ occur we can find a matrix in SO(2m)
that does not commute with $A$; namely, a matrix with $R_\theta$ on the diagonal
at the position of an adjacent $+1$ and $-1$ in $A$, and otherwise only 1's on
the diagonal. So, in fact, $A = \mathbf{1}$ or $A = -\mathbf{1}$. Both
$\mathbf{1}$ and $-\mathbf{1}$ belong to SO(2m), and they obviously commute with
everything, so $Z(\mathrm{SO}(2m)) = \{\pm\mathbf{1}\}$.

Case (2): $A \in Z(\mathrm{SO}(2m+1))$.

The argument is very similar to that for Case (1), except for the last
step. The $(2m+1) \times (2m+1)$ matrix $-\mathbf{1}$ does not belong to
SO(2m+1), because its determinant equals $-1$. Hence
$Z(\mathrm{SO}(2m+1)) = \{\mathbf{1}\}$.
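
The non-commutation claim used in Cases (1) and (2) can be checked by direct multiplication. The worked equation below is a sketch that assumes $I^*$ denotes the reflection matrix $\mathrm{diag}(1,-1)$; any other $2 \times 2$ matrix failing to commute with $R_\theta$ would serve the same purpose.

```latex
\[
R_\theta I^* =
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
=
\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix},
\qquad
I^* R_\theta =
\begin{pmatrix} \cos\theta & -\sin\theta \\ -\sin\theta & -\cos\theta \end{pmatrix},
\]
\[
R_\theta I^* - I^* R_\theta =
\begin{pmatrix} 0 & 2\sin\theta \\ 2\sin\theta & 0 \end{pmatrix}.
\]
```

The difference vanishes exactly when $\sin\theta = 0$, which is the condition used in the proof.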

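Case (3): $A \in Z(\mathrm{U}(n))$.

In this case $A = Z_{\theta_1,\theta_2,\ldots,\theta_n}$ for some
$\theta_1, \theta_2, \ldots, \theta_n$ and $A$ commutes with all elements of
U(n). If $n = 1$ then U(n) is isomorphic to the abelian group
$\mathbb{S}^1 = \{e^{i\theta} : \theta \in \mathbb{R}\}$, so U(1) is its own
center. If $n \ge 2$ we take advantage of the fact that

\[ \begin{pmatrix} e^{i\theta_1} & 0 \\ 0 & e^{i\theta_2} \end{pmatrix}
   \quad\text{does not commute with}\quad
   \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]

unless $e^{i\theta_1} = e^{i\theta_2}$. It follows, by building a matrix with
$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ somewhere on the diagonal and
otherwise only 1s on the diagonal, that $A = Z_{\theta_1,\theta_2,\ldots,\theta_n}$
must have $e^{i\theta_1} = e^{i\theta_2} = \cdots = e^{i\theta_n}$.

In other words, elements of $Z(\mathrm{U}(n))$ have the form $e^{i\theta}\mathbf{1}$.
Conversely, all matrices of this form are in U(n), and they commute with all
other matrices. Hence

\[ Z(\mathrm{U}(n)) = \{e^{i\theta}\mathbf{1} : \theta \in \mathbb{R}\} = \{\omega\mathbf{1} : |\omega| = 1\}. \]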

Case (4): $A \in Z(\mathrm{SU}(n))$.

The argument for U(n) shows that $A$ must have the form $\omega\mathbf{1}$, where
$|\omega| = 1$. But in SU(n) we must also have

\[ 1 = \det(A) = \omega^n. \]

This means that $\omega$ is one of the $n$ "roots of unity"

\[ 1,\ e^{2i\pi/n},\ e^{4i\pi/n},\ \ldots,\ e^{2(n-1)i\pi/n}. \]

All such matrices $\omega\mathbf{1}$ clearly belong to SU(n) and commute with
everything, hence $Z(\mathrm{SU}(n)) = \{\omega\mathbf{1} : \omega^n = 1\}$.
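
As a small numerical illustration of Case (4), the following sketch lists the $n$ scalars $\omega$ with $\omega^n = 1$ and confirms that each has absolute value 1. The function name `center_of_su` is a hypothetical helper introduced here, not notation from the text; it uses the fact that the determinant of the $n \times n$ scalar matrix $\omega\mathbf{1}$ is $\omega^n$.

```python
import cmath

def center_of_su(n):
    """Scalars w with w**n == 1; each w gives the central matrix w*1 in SU(n)."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

n = 4
for w in center_of_su(n):
    # det(w*1) = w**n for an n x n scalar matrix, so the last value should be ~0.
    print(w, abs(w), abs(w ** n - 1))
```

For $n = 2$ the list reduces to $\pm 1$, in agreement with the center of SU(2).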

Case (5): $A \in Z(\mathrm{Sp}(n))$.

In this case $A = Q_{\theta_1,\theta_2,\ldots,\theta_n}$ for some
$\theta_1, \theta_2, \ldots, \theta_n$ and $A$ commutes with all elements of
Sp(n). The argument used for U(n) applies, up to the point of showing that all
matrices in $Z(\mathrm{Sp}(n))$ have the form $q\mathbf{1}$, where $|q| = 1$.
But now we must bear in mind that quaternions $q$ do not generally commute.
Indeed, only the real quaternions commute with all the others, and the only
real quaternions $q$ with $|q| = 1$ are $q = 1$ and $q = -1$. Thus

\[ Z(\mathrm{Sp}(n)) = \{\pm\mathbf{1}\}. \]




□ 


Exercises

It happens that the quotient of each of the groups SO(n), U(n), SU(n), Sp(n) by
its center is a group with trivial center (see Exercise 3.8.1). However, it is
not generally true that the quotient of a group by its center has trivial center.

3.7.1 Find the center $Z(G)$ of $G = \{1, -1, i, -i, j, -j, k, -k\}$ and hence
show that $G/Z(G)$ has nontrivial center.

3.7.2 Prove that $\mathrm{U}(n)/Z(\mathrm{U}(n)) = \mathrm{SU}(n)/Z(\mathrm{SU}(n))$.

3.7.3 Is $\mathrm{SU}(2)/Z(\mathrm{SU}(2)) = \mathrm{SO}(3)$?

3.7.4 Using the relationship between U(n), Z(U(n)), and SU(n), or otherwise,
show that U(n) is path-connected.

3.8 Connectedness and discreteness 

Finding the centers of SO(n), U(n), SU(n), and Sp(n) is an important step 
towards understanding which of these groups are simple. The center of 
any group G is a normal subgroup of G, hence G cannot be simple unless 
Z(G) = {1}. This rules out all of the groups above except the SO(2m+ 1). 
Deciding whether there are any other normal subgroups of SO(2m+ 1) 
hinges on the distinction between discrete and nondiscrete subgroups. 

A subgroup $H$ of a matrix Lie group $G$ is called discrete if there is a
positive lower bound to the distance between any two members of $H$, the
distance between matrices $(a_{ij})$ and $(b_{ij})$ being defined as

\[ \sqrt{\sum_{i,j} |a_{ij} - b_{ij}|^2}. \]

(We say more about the distance between matrices in the next chapter.) In
particular, any finite subgroup of $G$ is discrete, so the centers of SO(n),
SU(n), and Sp(n) are discrete. On the other hand, the center of U(n) is
clearly not discrete, because it includes elements arbitrarily close to the
identity matrix.

In finding the centers of SO(n), SU(n), and Sp(n) we have in fact found
all their discrete normal subgroups, because of the following remarkable
theorem, due to Schreier [1925].

Centrality of discrete normal subgroups. If $G$ is a path-connected matrix
Lie group with a discrete normal subgroup $H$, then $H$ is contained in the
center $Z(G)$ of $G$.

Proof. Since $H$ is normal, $BAB^{-1} \in H$ for each $A \in H$ and $B \in G$.
Thus $B \mapsto BAB^{-1}$ defines a continuous map from $G$ into the discrete
set $H$. Since $G$ is path-connected, and a continuous map sends paths to paths,
the image of the map must be a single point of $H$. This point is necessarily
$A$ because $\mathbf{1} \mapsto \mathbf{1}A\mathbf{1}^{-1} = A$.

In other words, each $A \in H$ has the property that $BA = AB$ for all
$B \in G$. That is, $A \in Z(G)$. □

The groups SO(n), SU(n), and Sp(n) are path-connected, as we have
seen in Sections 3.2, 3.3, and 3.4, so all their discrete normal subgroups
are in their centers, determined in Section 3.7. In particular, SO(2m+1)
has no nontrivial discrete normal subgroup, because its center is $\{\mathbf{1}\}$.

It follows that the only normal subgroups we may have missed in
SO(n), SU(n), and Sp(n) are those that are not discrete. In Section 7.5
we will establish that such subgroups do not exist, so all normal subgroups
of SO(n), SU(n), and Sp(n) are in their centers. In particular,
the groups SO(2m+1) are all simple, and it follows from Exercise 3.8.1
below that the rest are simple "modulo their centers." That is, for
$G$ = SO(2m), SU(n), Sp(n), the group $G/Z(G)$ is simple.

Exercises 

3.8.1 If $Z(G)$ is the only nontrivial normal subgroup of $G$, show that
$G/Z(G)$ is simple.

The result of Exercises 3.2.1, 3.2.2, 3.2.3 can be improved, with the help of
some ideas used above, to show that the identity component is a normal subgroup
of $G$.

3.8.2 Show that, if $H$ is a subgroup of $G$ and $AHA^{-1} \subseteq H$ for each
$A \in G$, then $H$ is a normal subgroup of $G$.

3.8.3 If $G$ is a matrix group with identity component $H$, show that
$AHA^{-1} \subseteq H$ for each matrix $A \in G$.

The proof of Schreier's theorem assumes only that there is no path in $H$
between two distinct members, that is, $H$ is totally disconnected. Thus we have
actually proved: if $G$ is a path-connected group with a totally disconnected
normal subgroup $H$, then $H$ is contained in $Z(G)$. We can give examples of
totally disconnected subgroups that are not discrete.

3.8.4 Show that the subgroup $H = \{\cos 2\pi r + i \sin 2\pi r : r \text{ rational}\}$
of the circle SO(2) is totally disconnected but dense, that is, each arc of the
circle contains an element of $H$.

This example is also a normal subgroup. However, normal, dense, totally
disconnected subgroups are rare.

3.8.5 Explain why there is no normal, dense, totally disconnected subgroup of
SO(n) for $n > 2$.


3.9 Discussion 


The idea of treating orthogonal, unitary, and symplectic groups uniformly
as generalized isometry groups of the spaces $\mathbb{R}^n$, $\mathbb{C}^n$, and
$\mathbb{H}^n$ seems to be due to Chevalley [1946]. Before the appearance of
Chevalley's book, the symplectic group Sp(n) was generally viewed as the group
of unitary transformations of $\mathbb{C}^{2n}$ that preserve the symplectic form

\[ (\alpha_1\beta_1' - \beta_1\alpha_1') + \cdots + (\alpha_n\beta_n' - \beta_n\alpha_n'), \]

where $(\alpha_1, \beta_1, \ldots, \alpha_n, \beta_n)$ is the typical element of
$\mathbb{C}^{2n}$. This element corresponds to the element $(q_1, \ldots, q_n)$
of $\mathbb{H}^n$, where


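\[ q_k = \begin{pmatrix} \alpha_k & -\beta_k \\ \overline{\beta_k} & \overline{\alpha_k} \end{pmatrix}. \]

The invariance of the quaternion inner product

\[ q_1\overline{q_1'} + \cdots + q_n\overline{q_n'} \]

is therefore equivalent to the invariance of the matrix product

\[ \begin{pmatrix} \alpha_1 & -\beta_1 \\ \overline{\beta_1} & \overline{\alpha_1} \end{pmatrix}
   \overline{\begin{pmatrix} \alpha_1' & -\beta_1' \\ \overline{\beta_1'} & \overline{\alpha_1'} \end{pmatrix}}^{\,T}
   + \cdots +
   \begin{pmatrix} \alpha_n & -\beta_n \\ \overline{\beta_n} & \overline{\alpha_n} \end{pmatrix}
   \overline{\begin{pmatrix} \alpha_n' & -\beta_n' \\ \overline{\beta_n'} & \overline{\alpha_n'} \end{pmatrix}}^{\,T}, \]

which turns out to be equivalent to the invariance of the symplectic form.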
The word “symplectic” itself was introduced by Hermann Weyl in his book 
The Classical Groups, Weyl [1939], p. 165: 

The name “complex group” formerly advocated by me in al- 
lusion to line complexes, as these are defined by the vanishing 
of antisymmetric bilinear forms, has become more and more 
embarrassing through collision with the word “complex” in 
the connotation of complex number. I therefore propose to re- 
place it with the corresponding Greek adjective “symplectic.”
Maximal tori were also introduced by Weyl, in his paper Weyl [1925].
In this book we use them only to find the centers of the orthogonal, unitary, 
and symplectic groups, since the centers turn out to be crucial in the inves- 
tigation of simplicity. However, maximal tori themselves are important for 
many investigations in the structure of Lie groups. 

The existence of a nontrivial center in SO(2m), SU(n), and Sp(n)
shows that these groups are not simple, since the center is obviously a 
normal subgroup. Nevertheless, these groups are almost simple, because 
the center is in each case their largest normal subgroup. We have shown in 
Section 3.8 that the center is the largest normal subgroup that is discrete, 
in the sense that there is a minimum, nonzero, distance between any two 
of its elements. It therefore remains to show that there are no nondiscrete 
normal subgroups, which we do in Section 7.5. 

It turns out that the quotient groups of SO(2m), SU(n), and Sp(n) by
their centers are simple and, from the Lie theory viewpoint, taking these 
quotients makes very little difference. The center is essentially “invisible,” 
because its tangent space is zero. We explain “invisibility” in Chapter 5, 
after looking at the tangent spaces of some particular groups in Chapter 4. 

It should be mentioned, however, that the quotient of a matrix group
by a normal subgroup is not necessarily a matrix group. Thus in taking
quotients we may leave the world of matrix groups. The first example was
discovered by Birkhoff [1936]. It is the quotient (called the Heisenberg
group) of the group of upper triangular matrices of the form

\[ \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix}, \quad x, y, z \in \mathbb{R}, \]

by the subgroup of matrices of the form

\[ \begin{pmatrix} 1 & 0 & n \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad n \in \mathbb{Z}. \]

The Heisenberg group is a Lie group, but not isomorphic to a matrix group.

One of the reasons for looking at tangent spaces is that we do not have
to leave the world of matrices. A theorem of Ado from 1936 shows that
the tangent space of any Lie group $G$ (the Lie algebra $\mathfrak{g}$) can be
faithfully represented by a space of matrices. And if $G$ is almost simple then
$\mathfrak{g}$ is truly simple, in a sense that will be explained in Chapter 6.
Thus the study of simplicity is, well, simplified by passing from Lie groups to
Lie algebras.

The importance of topology in Lie theory, and particularly paths and
connectedness, was first realized by Schreier in 1925. Schreier published
his results in the journal of the Hamburg mathematical seminar (a well-known
journal for algebra and topology at the time), but they were not
noticed by Lie theorists until after Schreier's untimely death in 1929 at the
age of 28. In 1929, Elie Cartan became aware of Schreier's results and
picked up the torch of topology in Lie theory.

In the 1930s, Cartan proved several remarkable results on the topology
of Lie groups. One of them has the consequence that $\mathbb{S}^1$ and
$\mathbb{S}^3$ are the only spheres that admit a continuous group structure.
Thus the Lie groups SO(2) and SU(2), which we already know to be spheres, are
the only spheres that actually occur among Lie groups. Cartan's proof uses
quite sophisticated topology, but his result is related to the theorem of
Frobenius mentioned in Section 1.6, that the only skew fields $\mathbb{R}^n$ are
$\mathbb{R}$, $\mathbb{R}^2 = \mathbb{C}$, and $\mathbb{R}^4 = \mathbb{H}$. In
particular, there is a continuous and associative "multiplication" (necessary
for continuous group structure) only in $\mathbb{R}$, $\mathbb{R}^2$, and
$\mathbb{R}^4$. For more on the interplay between topology and algebra in
$\mathbb{R}^n$, see the book Ebbinghaus et al. [1990].


4 


The exponential map 


Preview 

The group $\mathbb{S}^1$ = SO(2) studied in Chapter 1 can be viewed as the image
of the line $\mathbb{R}i = \{i\theta : \theta \in \mathbb{R}\}$ under the
exponential function, because

\[ \exp(i\theta) = e^{i\theta} = \cos\theta + i\sin\theta. \]

This line is (in a sense we explain below) the tangent to the circle at its
identity element 1. And, in fact, any Lie group has a linear space (of the
same dimension as the group) as its tangent space at the identity.

The group $\mathbb{S}^3$ = SU(2) is also the image, under a generalized exp
function, of a linear space. This linear space, the tangent space of SU(2) at
the identity, is three-dimensional and has an interesting algebraic structure.
Its points can be added (as vectors) and also multiplied in a way that
reflects the nontrivial conjugation operation $g_1, g_2 \mapsto g_1 g_2 g_1^{-1}$
in SU(2). The algebra $\mathfrak{su}(2)$ on the tangent space is called the Lie
algebra of the Lie group SU(2), and it is none other than $\mathbb{R}^3$ with
the vector product.

As we know from Chapter 1, complex numbers and quaternions can 
both be viewed as matrices. The exponential function exp generalizes to 
arbitrary square matrices, and we will see later that it maps the tangent 
space of any matrix Lie group G into G. In many cases exp is onto G, and in 
all cases the algebraic structure of G has a parallel structure on the tangent 
space, called the Lie algebra of G. In particular, the conjugation operation 
on G, which reflects the departure of G from commutativity, corresponds 
to an operation on the tangent space called the Lie bracket. 

We illustrate the exp function on matrices with the simplest nontrivial 
example, the affine group of the line. 

74 J. Stillwell, Naive Lie Theory, DOI: 10.1007/978-0-387-78214-0_4.

4.1 The exponential map onto SO(2)

The relationship between the exponential function and the circle,

\[ e^{i\theta} = \cos\theta + i\sin\theta, \]

was discovered by Euler in his book Introduction to the Analysis of the
Infinite of 1748. One way to see why this relationship holds is to look at
the Taylor series for $e^x$, $\cos x$, and $\sin x$, and to suppose that the
exponential series is also meaningful for complex numbers.


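\[ e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots, \]
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots, \]
\[ \sin x = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots. \]

The series for $e^x$ is absolutely convergent, so we may substitute $i\theta$
for $x$ and rearrange terms. This gives a definition of $e^{i\theta}$ and
justifies the following calculation:

\[ \begin{aligned}
e^{i\theta} &= 1 + \frac{i\theta}{1!} - \frac{\theta^2}{2!} - \frac{i\theta^3}{3!} + \frac{\theta^4}{4!} + \frac{i\theta^5}{5!} - \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) + i\left(\frac{\theta}{1!} - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right) \\
&= \cos\theta + i\sin\theta.
\end{aligned} \]

Thus the exponential function maps the imaginary axis $\mathbb{R}i$ of points
$i\theta$ onto the circle $\mathbb{S}^1$ of points $\cos\theta + i\sin\theta$ in
the plane of complex numbers.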
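
The calculation above can be checked numerically. The following sketch sums the first terms of the exponential series at $x = i\theta$ and compares the result with $\cos\theta + i\sin\theta$; the function name `exp_series` and the choice of 40 terms are assumptions made for illustration only.

```python
import math

def exp_series(z, terms=40):
    """Partial sum 1 + z/1! + z^2/2! + ... using `terms` terms."""
    total = 0j
    term = 1 + 0j          # current term z^n / n!
    for n in range(terms):
        total += term
        term *= z / (n + 1)
    return total

theta = 0.75
approx = exp_series(1j * theta)
exact = complex(math.cos(theta), math.sin(theta))
print(approx, exact, abs(approx - exact))
```

The partial sum also has absolute value very close to 1, as a point of the circle should.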

The operations of addition and negation on $\mathbb{R}i$ carry over to
multiplication and inversion on $\mathbb{S}^1$, since

\[ e^{i\theta_1} e^{i\theta_2} = e^{i(\theta_1+\theta_2)} \quad\text{and}\quad (e^{i\theta})^{-1} = e^{-i\theta}. \]

There is not much more to say about $\mathbb{S}^1$, because multiplication of
complex numbers is a well-known operation and the circle is a well-known
curve. However, we draw attention to one trifling fact, because it proves to
have a more interesting analogue in the case of $\mathbb{S}^3$ that we study in
the next section. The line of points $i\theta$ mapped onto $\mathbb{S}^1$ by the
exponential function can be viewed as the tangent to $\mathbb{S}^1$ at the
identity element 1 (Figure 4.1). Of course, the points on the tangent are of the
form $1 + i\theta$, but we ignore their constant real part 1. The essential
coordinate of a point on the tangent is its imaginary part $i\theta$, giving its
height $\theta$ above the x-axis. Note also that the point $i\theta$ at height
$\theta$ is mapped to the point $\cos\theta + i\sin\theta$ at arc length
$\theta$. Thus the exponential map preserves the length of sufficiently small
arcs.


[Figure 4.1]



Euler's discovery that the exponential function can be extended to the
complex numbers, and that it can thereby map a straight line onto a curve,
was just the beginning. In the next section we will see that a further
extension of the exponential function can map the flat three-dimensional space
$\mathbb{R}^3$ onto a curved one, $\mathbb{S}^3$, and in the next chapter we
will see that such exponential mappings exist in arbitrarily high dimensions.


Exercises 


The fundamental property of the exponential function is the addition formula,
which tells us that exp maps sums to products, that is,

\[ e^{A+B} = e^A e^B. \]

However, we are about to generalize the exponential function to objects that do
not enjoy all the algebraic properties of real or complex numbers, so it is
important to investigate whether the equation $e^{A+B} = e^A e^B$ still holds.
The answer is that it does, provided $AB = BA$.

We assume that

\[ e^X = \mathbf{1} + \frac{X}{1!} + \frac{X^2}{2!} + \frac{X^3}{3!} + \cdots, \]

where $\mathbf{1}$ is the identity object.
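
To see the role of the condition $AB = BA$, the sketch below sums the series for $2 \times 2$ real matrices and compares $e^{A+B}$ with $e^A e^B$ for one commuting pair and one noncommuting pair. The helper names (`mat_mul`, `mat_add`, `mat_exp`, `max_diff`), the sample matrices, and the 40-term truncation are all assumptions made for this illustration; it is a numerical check, not a proof.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_exp(A, terms=40):
    """Partial sum 1 + A/1! + A^2/2! + ... using `terms` terms."""
    n = len(A)
    total = [[0.0] * n for _ in range(n)]
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for m in range(terms):
        total = mat_add(total, term)
        term = [[x / (m + 1) for x in row] for row in mat_mul(term, A)]
    return total

def max_diff(A, B):
    """Largest absolute difference between corresponding entries."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

A = [[1.0, 2.0], [0.0, 1.0]]
B = [[3.0, -1.0], [0.0, 3.0]]   # commutes with A
C = [[0.0, 0.0], [1.0, 0.0]]    # does not commute with A

print(max_diff(mat_exp(mat_add(A, B)), mat_mul(mat_exp(A), mat_exp(B))))  # tiny
print(max_diff(mat_exp(mat_add(A, C)), mat_mul(mat_exp(A), mat_exp(C))))  # not small
```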


4.1.1 Assuming that $AB = BA$, show that

\[ (A+B)^m = A^m + \binom{m}{1} A^{m-1}B + \binom{m}{2} A^{m-2}B^2 + \cdots + \binom{m}{m-1} AB^{m-1} + B^m, \]

where $\binom{m}{l}$ denotes the number of ways of choosing $l$ things from a
set of $m$ things.

4.1.2 Show that $\binom{m}{l} = \dfrac{m(m-1)(m-2)\cdots(m-l+1)}{l!} = \dfrac{m!}{l!\,(m-l)!}$.

4.1.3 Deduce from Exercises 4.1.1 and 4.1.2 that the coefficient of
$A^{m-l}B^l$ in

\[ e^{A+B} = \mathbf{1} + \frac{A+B}{1!} + \frac{(A+B)^2}{2!} + \frac{(A+B)^3}{3!} + \cdots \]

is $1/l!\,(m-l)!$ when $AB = BA$.

4.1.4 Show that the coefficient of $A^{m-l}B^l$ in

\[ \left(\mathbf{1} + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots\right)
   \left(\mathbf{1} + \frac{B}{1!} + \frac{B^2}{2!} + \frac{B^3}{3!} + \cdots\right) \]

is also $1/l!\,(m-l)!$, and hence that $e^{A+B} = e^A e^B$ when $AB = BA$.


4.2 The exponential map onto SU(2) 

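If $u = b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ is a unit vector in
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, then
$u^2 = -1$ by the argument at the end of Section 1.4. This leads to the
following elegant extension of the exponential map from pure imaginary numbers
to pure imaginary quaternions.

Exponentiation theorem for $\mathbb{H}$. When we write an arbitrary element of
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ in the form
$\theta u$, where $u$ is a unit vector, we have

\[ e^{\theta u} = \cos\theta + u\sin\theta \]

and the exponential function maps
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ onto
$\mathbb{S}^3$ = SU(2).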


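Proof. For any pure imaginary quaternion $v$ we define $e^v$ by the usual
infinite series

\[ e^v = 1 + \frac{v}{1!} + \frac{v^2}{2!} + \frac{v^3}{3!} + \cdots. \]

This series is absolutely convergent in $\mathbb{H}$ for the same reason as in
$\mathbb{C}$: for sufficiently large $n$, $|v|^n/n! < 2^{-n}$. Thus $e^v$ is
meaningful for any pure imaginary quaternion $v$. If $v = \theta u$, where $u$
is a pure imaginary and $|u| = 1$, then $u^2 = -1$ by the remark above, and we
get

\[ \begin{aligned}
e^{\theta u} &= 1 + \frac{\theta u}{1!} - \frac{\theta^2}{2!} - \frac{\theta^3 u}{3!} + \frac{\theta^4}{4!} + \frac{\theta^5 u}{5!} - \frac{\theta^6}{6!} - \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) + u\left(\frac{\theta}{1!} - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right) \\
&= \cos\theta + u\sin\theta.
\end{aligned} \]

Also, a point $a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} \in \mathbb{S}^3$ can
be written in the form

\[ a + \frac{b\mathbf{i} + c\mathbf{j} + d\mathbf{k}}{\sqrt{b^2+c^2+d^2}} \sqrt{b^2+c^2+d^2} = a + u\sqrt{b^2+c^2+d^2}, \]

where $u$ is a unit pure imaginary quaternion. Since $a^2 + b^2 + c^2 + d^2 = 1$
for a quaternion $a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} \in \mathbb{S}^3$,
there is a real $\theta$ such that

\[ a = \cos\theta, \qquad \sqrt{b^2+c^2+d^2} = \sin\theta. \]

Thus any point in $\mathbb{S}^3$ is of the form $\cos\theta + u\sin\theta$, and
so the exponential map is from
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ onto
$\mathbb{S}^3$. □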
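
The formula $e^{\theta u} = \cos\theta + u\sin\theta$ can be tested numerically. The sketch below represents a quaternion $a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ as a tuple `(a, b, c, d)`; the helper names `qmul` and `qexp`, the sample unit vector, and the 40-term truncation are assumptions made for this illustration.

```python
import math

def qmul(p, q):
    """Quaternion product of p = (a1, b1, c1, d1) and q = (a2, b2, c2, d2)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qexp(v, terms=40):
    """Partial sum 1 + v/1! + v^2/2! + ... of the quaternion exponential series."""
    total = (0.0, 0.0, 0.0, 0.0)
    term = (1.0, 0.0, 0.0, 0.0)
    for n in range(terms):
        total = tuple(x + y for x, y in zip(total, term))
        term = tuple(x / (n + 1) for x in qmul(term, v))
    return total

theta = 1.2
u = (0.0, 2 / 3, 1 / 3, 2 / 3)            # unit pure imaginary quaternion
v = tuple(theta * x for x in u)            # the element theta*u
series = qexp(v)
closed = (math.cos(theta),) + tuple(math.sin(theta) * x for x in u[1:])
print(series)
print(closed)
```

The two printed tuples agree to rounding error, and the result has norm 1, so it lies on $\mathbb{S}^3$.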

Up to this point, we have a beautiful analogy with the exponential map
in $\mathbb{C}$. The three-dimensional space
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ is the
tangent space of the 3-sphere $\mathbb{S}^3$ = SU(2) at the identity element 1,
as we will see in the next section.

But the algebraic situation on $\mathbb{S}^3$ is more interesting (if you like,
more complex) than on $\mathbb{S}^1$. For a pair of elements
$u, v \in \mathbb{S}^3$ we generally have $uv \ne vu$, and hence
$uvu^{-1} \ne v$. Thus the element $uvu^{-1}$, the conjugate of $v$ by $u^{-1}$,
detects failure to commute. Remarkably, the conjugation operation on
$\mathbb{S}^3$ = SU(2) is reflected in a noncommutative operation on the tangent
space $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ that
we uncover in the next section.


Exercises 

4.2.1 Show that the exponential function maps any line through $O$ in
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ onto a
circle of radius 1 in $\mathbb{S}^3$.

Since we can have $uv \ne vu$ for quaternions $u$ and $v$, it can be expected,
from the previous exercise set, that we can have $e^u e^v \ne e^{u+v}$.

4.2.2 Explain why $\mathbf{i} = e^{\mathbf{i}\pi/2}$ and $\mathbf{j} = e^{\mathbf{j}\pi/2}$.

4.2.3 Deduce from Exercise 4.2.2 that at least one of
$e^{\mathbf{i}\pi/2} e^{\mathbf{j}\pi/2}$, $e^{\mathbf{j}\pi/2} e^{\mathbf{i}\pi/2}$
is not equal to $e^{\mathbf{i}\pi/2 + \mathbf{j}\pi/2}$.

4.3 The tangent space of SU(2)

The space $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$
mapped onto SU(2) by the exponential function is the tangent space at 1 of
SU(2), just as the line $\mathbb{R}i$ is the tangent line at 1 of the circle
SO(2). But SU(2), unlike SO(2), cannot be viewed from "outside" by humans, so we
need a method for finding tangent vectors from "inside" SU(2). This method will
later be used for the higher-dimensional groups SO(n), SU(n), and so on.

The idea is to view a tangent vector at 1 as the "velocity vector" of
a smoothly moving point as it passes through 1. To be precise, consider
a differentiable function of $t$, whose values $q(t)$ are unit quaternions, and
suppose that $q(0) = 1$. Then the "velocity" $q'(0)$ at $t = 0$ is a tangent
vector to SU(2), and all the tangent vectors to SU(2) at 1 are obtained in this
way.

The assumption that $q(t)$ is a unit quaternion for each $t$ in the domain
of $q$ means that

\[ q(t)\overline{q(t)} = 1, \qquad (*) \]

because $q\overline{q} = |q|^2$ for each quaternion $q$, as we saw in Section
1.3. By differentiating (*), using the product rule, we find that

\[ q'(t)\overline{q(t)} + q(t)\overline{q'(t)} = 0. \]

(The usual proof of the product rule applies, even though quaternions do
not necessarily commute. It is a good exercise to check why this is so.)
Then setting $t = 0$, and bearing in mind that $q(0) = 1$, we obtain

\[ q'(0) + \overline{q'(0)} = 0. \]

So, every tangent vector $q'(0)$ to SU(2) satisfies

\[ q'(0) + \overline{q'(0)} = 0, \]

which means that $q'(0)$ is a pure imaginary quaternion $p$. Conversely, if
$p$ is any pure imaginary quaternion, then
$pt \in \mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ for
any real number $t$, and we know from the previous section that
$e^{pt} \in$ SU(2). Thus $q(t) = e^{pt}$ is a path in SU(2). This path passes
through 1 when $t = 0$, and it is smooth because it has the derivative

\[ q'(t) = p e^{pt}. \]

(To see why, differentiate the infinite series for $e^{pt}$.) Finally,
$q'(0) = p$, because $e^0 = 1$. Thus every pure imaginary quaternion is a
tangent vector to SU(2) at 1, and so the tangent space of SU(2) at 1 is
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$.

This construction of the tangent space to SU(2) at 1 provides a model
that we will follow for the so-called classical Lie groups in Chapter 5. In all
cases it is easy to find the general form of a tangent vector by differentiating
the defining equation of the group, but one needs the exponential function
(for matrices) to confirm that each matrix $X$ of the form in question is in
fact a tangent vector (namely, the tangent to the smooth path $e^{tX}$).
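
The first half of this argument, that velocity vectors at 1 have zero real part, can be illustrated numerically. The sketch below takes one sample path of unit quaternions through 1 (a hypothetical choice, obtained by normalizing a simple curve, and not a path used in the text) and estimates its velocity at $t = 0$ by a central difference.

```python
import math

def unit_path(t):
    """A sample smooth path of unit quaternions with unit_path(0) = (1, 0, 0, 0)."""
    raw = (1.0, t, t * t, math.sin(t))
    norm = math.sqrt(sum(x * x for x in raw))
    return tuple(x / norm for x in raw)

def velocity_at_zero(path, h=1e-5):
    """Central-difference estimate of path'(0), component by component."""
    forward, backward = path(h), path(-h)
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))

tangent = velocity_at_zero(unit_path)
print(tangent)   # real part is (numerically) zero: a pure imaginary quaternion
```

For this particular path the estimated velocity is close to $\mathbf{i} + \mathbf{k}$, which has no real part, as the differentiated equation (*) predicts.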

The Lie bracket 

The great idea of Sophus Lie was to look at elements "infinitesimally close
to the identity" in a Lie group, and to use them to infer behavior of ordinary
elements. The modern version of Lie's idea is to infer properties of
the Lie group from properties of its tangent space. A commutative group
operation, as on SO(2), is completely captured by the sum operation on the
tangent space, because $e^{x+y} = e^x e^y$. The real secret of the tangent space
is an extra structure called the Lie bracket operation, which reflects the
noncommutative content of the group operation. (For a commutative group,
such as SO(2), the Lie bracket on the tangent space is always zero.)

In the case of SU(2) we can already see that the sum operation on
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ is
commutative, so it cannot adequately reflect the product operation on SU(2). Nor
can the product on SU(2) be captured by the quaternion product on
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, because
the quaternion product is not always defined on
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$. For
example, $\mathbf{i}$ belongs to
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ but the
product $\mathbf{i}^2$ does not. What we find is that the noncommutative
content of the product on SU(2) is captured by the Lie bracket of pure
imaginary quaternions $U$, $V$ defined by

\[ [U, V] = UV - VU. \]

This comes about as follows. Suppose that $u(s)$ and $v(t)$ are two smooth
paths through 1 in SU(2), with $u(0) = v(0) = 1$. For each fixed $s$ we
consider the path

\[ w_s(t) = u(s)v(t)u(s)^{-1}. \]

This path also passes through 1, and its tangent there is

\[ w_s'(0) = u(s)v'(0)u(s)^{-1} = u(s)Vu(s)^{-1}, \]

where $V = v'(0)$ is the tangent vector to $v(t)$ at 1. Now $w_s'(0)$ is a
tangent vector at 1 for each $s$, so (letting $s$ vary)

\[ x(s) = u(s)Vu(s)^{-1} \]

is a smooth path in
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$. The
tangent $x'(0)$ to this path at $s = 0$ is also an element of
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, because
$x'(0)$ is the limit of differences between elements of
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, and
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ is closed
under differences and limits. By the product rule for differentiation, and
because $u(0) = 1$, the tangent vector $x'(0)$ is

\[ \begin{aligned}
\left.\frac{d}{ds}\right|_{s=0} u(s)Vu(s)^{-1} &= u'(0)Vu(0)^{-1} + u(0)V(-u'(0)) \\
&= UV - VU,
\end{aligned} \]

where $U = u'(0)$ is the tangent vector to $u(s)$ at 1.

It follows that if
$U, V \in \mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$
then $[U, V] \in \mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$.
It is possible to give a direct algebraic proof of this fact (see exercises).
But the proof above shows the connection between the conjugate of $v(t)$ by
$u(s)^{-1}$ and the Lie bracket of their tangent vectors, and it generalizes to
a proof that $U, V \in T_{\mathbf{1}}(G)$ implies $[U, V] \in T_{\mathbf{1}}(G)$
for any matrix Lie group $G$. In fact, we revisit this proof in Section 5.4.
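
The derivative computed above can be checked numerically. The sketch below assumes the particular path $u(s) = e^{sU}$ (one convenient smooth path through 1 with tangent $U$), represents quaternions as tuples `(a, b, c, d)`, and compares a central-difference estimate of the derivative of $u(s)Vu(s)^{-1}$ at $s = 0$ with $UV - VU$. All helper names and sample values are assumptions for illustration.

```python
def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def qexp(v, terms=40):
    """Partial sum of the exponential series 1 + v/1! + v^2/2! + ..."""
    total = (0.0, 0.0, 0.0, 0.0)
    term = (1.0, 0.0, 0.0, 0.0)
    for n in range(terms):
        total = tuple(x + y for x, y in zip(total, term))
        term = tuple(x / (n + 1) for x in qmul(term, v))
    return total

def bracket(U, V):
    """Lie bracket [U, V] = UV - VU."""
    return tuple(x - y for x, y in zip(qmul(U, V), qmul(V, U)))

U = (0.0, 1.0, 2.0, 3.0)      # pure imaginary quaternions (sample values)
V = (0.0, -1.0, 0.5, 2.0)

def x(s):
    u = qexp(tuple(s * c for c in U))        # u(s) = exp(sU), a unit quaternion
    return qmul(qmul(u, V), qconj(u))        # inverse of a unit quaternion is its conjugate

h = 1e-5
derivative = tuple((a - b) / (2 * h) for a, b in zip(x(h), x(-h)))
print(derivative)
print(bracket(U, V))
```

Both printed tuples should be close to $(0, 5, -10, 5)$, a pure imaginary quaternion.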


Exercises 


The definition of derivative for any function $c(t)$ of a real variable $t$ is

\[ c'(t) = \lim_{\Delta t \to 0} \frac{c(t + \Delta t) - c(t)}{\Delta t}. \]

4.3.1 By imitating the usual proof of the product rule, show that if
$c(t) = a(t)b(t)$ then

\[ c'(t) = a'(t)b(t) + a(t)b'(t). \]

(Do not assume that the product operation is commutative.)

4.3.2 Show also that if $c(t) = a(t)^{-1}$, and $a(0) = 1$, then
$c'(0) = -a'(0)$, again without assuming that the product is commutative.

4.3.3 Show, however, that if $c(t) = a(t)^2$ then $c'(t)$ is not equal to
$2a(t)a'(t)$ for a certain quaternion-valued function $a(t)$.

To investigate the Lie bracket operation on
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, it helps
to know what it has in common with more familiar product operations, namely
bilinearity: for any real numbers $a_1$ and $a_2$,

\[ [a_1U_1 + a_2U_2, V] = a_1[U_1, V] + a_2[U_2, V], \qquad [U, a_1V_1 + a_2V_2] = a_1[U, V_1] + a_2[U, V_2]. \]

4.3.4 Deduce the bilinearity property from the definition of $[U, V]$.

4.3.5 Using bilinearity, or otherwise, show that
$U, V \in \mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$
implies $[U, V] \in \mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$.

4.4 The Lie algebra $\mathfrak{su}(2)$ of SU(2)

The tangent space
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ of SU(2) is
a real vector space, or a vector space over $\mathbb{R}$. That is, it is closed
under the vector sum operation, and also under multiplication by real numbers.
The additional structure provided by the Lie bracket operation makes it what we
call $\mathfrak{su}(2)$, the Lie algebra of SU(2).$^2$ In general, a Lie algebra
is a vector space with a bilinear operation $[\ ,\ ]$ satisfying

\[ [X, Y] + [Y, X] = 0, \]
\[ [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0. \]

These algebraic properties look like poor relations of the commutative and
associative laws, and no doubt they seem rather alien at first. Nevertheless,
they are easily seen to be satisfied by the Lie bracket $[U, V] = UV - VU$ on
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ and, more
generally, on any vector space of matrices closed under the operation
$U, V \mapsto UV - VU$ (see exercises). In the next chapter we will see that the
tangent space of any so-called classical group is a Lie algebra for much the
same reason that $\mathfrak{su}(2)$ is.
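
As a quick sanity check of the two defining properties, the sketch below evaluates them for the bracket $[U, V] = UV - VU$ on three sample pure imaginary quaternions with integer coordinates, so that the arithmetic is exact. Quaternions are represented as tuples `(a, b, c, d)`; the helper names and sample values are assumptions for illustration, and a check on one triple is of course not a proof.

```python
def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return tuple(x - y for x, y in zip(qmul(X, Y), qmul(Y, X)))

X, Y, Z = (0, 1, 2, 3), (0, -1, 0, 4), (0, 2, 5, -3)

antisymmetry = qadd(bracket(X, Y), bracket(Y, X))
jacobi = qadd(qadd(bracket(X, bracket(Y, Z)), bracket(Y, bracket(Z, X))),
              bracket(Z, bracket(X, Y)))
print(antisymmetry, jacobi)   # both are the zero quaternion
```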

What makes $\mathfrak{su}(2)$ particularly interesting is that it is probably
the only nontrivial Lie algebra that anyone meets before studying Lie theory.
Its Lie bracket is not as alien as it looks, being essentially the cross product
operation on $\mathbb{R}^3$ that one meets in vector algebra.

To see why, consider the Lie brackets of the basis vectors $\mathbf{i}$,
$\mathbf{j}$, and $\mathbf{k}$ of
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, which are

\[ [\mathbf{i}, \mathbf{j}] = \mathbf{ij} - \mathbf{ji} = \mathbf{k} + \mathbf{k} = 2\mathbf{k}, \]
\[ [\mathbf{j}, \mathbf{k}] = \mathbf{jk} - \mathbf{kj} = \mathbf{i} + \mathbf{i} = 2\mathbf{i}, \]
\[ [\mathbf{k}, \mathbf{i}] = \mathbf{ki} - \mathbf{ik} = \mathbf{j} + \mathbf{j} = 2\mathbf{j}. \]

Then, if we introduce the new basis vectors

\[ \mathbf{i}' = \mathbf{i}/2, \quad \mathbf{j}' = \mathbf{j}/2, \quad \mathbf{k}' = \mathbf{k}/2, \]

we get

\[ [\mathbf{i}', \mathbf{j}'] = \mathbf{k}', \quad [\mathbf{j}', \mathbf{k}'] = \mathbf{i}', \quad [\mathbf{k}', \mathbf{i}'] = \mathbf{j}'. \]

The latter equations are precisely the same as those defining the cross
product on the usual basis vectors.

[Footnote 2: It is traditional to denote the Lie algebra of a Lie group by the
corresponding lower case Fraktur (also called German or Gothic) letter. Thus the
Lie algebra of $G$ will be denoted by $\mathfrak{g}$, the Lie algebra of SU(n)
by $\mathfrak{su}(n)$, and so on.]

This probably makes it clear that the cross product on $\mathbb{R}^3$ is "the
same" as the Lie bracket on
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, but we can
spell out precisely why by setting up a 1-to-1 correspondence between
$\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ and
$\mathbb{R}^3$ that preserves the vector sum and scalar multiples (the vector
space operations), while sending the Lie bracket to the cross product.

The map $\varphi : b\mathbf{i} + c\mathbf{j} + d\mathbf{k} \mapsto (2b, 2c, 2d)$
is a 1-to-1 correspondence that preserves the vector space operations, and it
also sends $\mathbf{i}'$, $\mathbf{j}'$, $\mathbf{k}'$ and their Lie brackets to
$\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ and their cross products, respectively.
It follows that $\varphi$ sends all Lie brackets to the corresponding cross
products, because the Lie bracket of arbitrary vectors, like the cross product
of arbitrary vectors, is determined by its values on the basis vectors (by
bilinearity).
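
The correspondence can be spot-checked on sample vectors. The sketch below represents quaternions as tuples `(a, b, c, d)` and vectors of $\mathbb{R}^3$ as triples; the helper names `phi`, `cross`, and `bracket` and the sample values are assumptions for illustration.

```python
def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def bracket(U, V):
    """Lie bracket [U, V] = UV - VU."""
    return tuple(x - y for x, y in zip(qmul(U, V), qmul(V, U)))

def phi(q):
    """The map bi + cj + dk -> (2b, 2c, 2d)."""
    return (2 * q[1], 2 * q[2], 2 * q[3])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

U, V = (0, 1, 2, 3), (0, -1, 0, 4)
print(phi(bracket(U, V)))          # image of the Lie bracket
print(cross(phi(U), phi(V)))       # cross product of the images
```

Both lines print the same triple, as the bilinearity argument predicts.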


Exercises 


The second property of the Lie bracket is known as the Jacobi identity, and all
beginners in Lie theory are asked to check that it follows from the definition
$[X, Y] = XY - YX$.

4.4.1 Prove the Jacobi identity by using the definition $[X, Y] = XY - YX$ to
expand $[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]]$. Assume only that the product
is associative and that the usual laws for plus and minus apply.

4.4.2 Using known properties of the cross product, or otherwise, show that the
Lie bracket operation on $\mathfrak{su}(2)$ is not associative.

In the words of Kaplansky [1963], p. 123,

    . . . the commutative and associative laws, so sadly lacking in the Lie
    algebra itself, are acquired under the mantle of $f$.

By $f$ he means a certain inner product, called the Killing form. A special case
of it is the ordinary inner product on $\mathbb{R}^3$, for which we certainly
have commutativity: $u \cdot v = v \cdot u$. "Associativity under the mantle of
the inner product" means

\[ (u \times v) \cdot w = u \cdot (v \times w). \]

4.4.3 Show that if

\[ u = u_1\mathbf{i} + u_2\mathbf{j} + u_3\mathbf{k}, \quad v = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}, \quad w = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}, \]

then

\[ u \cdot (v \times w) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}. \]

4.4.4 Deduce from Exercise 4.4.3 that $(u \times v) \cdot w = u \cdot (v \times w)$.

4.5 The exponential of a square matrix

We define the matrix absolute value of $A = (a_{ij})$ to be

\[ |A| = \sqrt{\sum_{i,j} |a_{ij}|^2}. \]

For an $n \times n$ real matrix $A$ the absolute value $|A|$ is the distance from the
origin $O$ in $\mathbb{R}^{n^2}$ of the point

\[ (a_{11}, a_{12}, \ldots, a_{1n}, a_{21}, a_{22}, \ldots, a_{2n}, \ldots, a_{n1}, \ldots, a_{nn}). \]

If $A$ has complex entries, and if we interpret each copy of $\mathbb{C}$ as $\mathbb{R}^2$ (as in
Section 3.3), then $|A|$ is the distance from $O$ of the corresponding point in
$\mathbb{R}^{2n^2}$. Similarly, if $A$ has quaternion entries, then $|A|$ is the distance from $O$
of the corresponding point in $\mathbb{R}^{4n^2}$.

In all cases, $|A - B|$ is the distance between the matrices $A$ and $B$, and
we say that a sequence $A_1, A_2, A_3, \ldots$ of $n \times n$ matrices has limit $A$ if, for
each $\varepsilon > 0$, there is an integer $M$ such that

\[ m > M \;\Rightarrow\; |A_m - A| < \varepsilon. \]

The key property of the matrix absolute value is the following inequality,
a consequence of the triangle inequality (which holds in the plane and
hence in any $\mathbb{R}^k$) and the Cauchy-Schwarz inequality.

Submultiplicative property. For any two real $n \times n$ matrices $A$ and $B$,

\[ |AB| \le |A||B|. \]

Proof. If $A = (a_{ij})$ and $B = (b_{ij})$, then it follows from the definition of
matrix product that

\begin{align*}
|(i,j)\text{-entry of } AB| &= |a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}| \\
&\le |a_{i1}b_{1j}| + |a_{i2}b_{2j}| + \cdots + |a_{in}b_{nj}|
  && \text{by the triangle inequality} \\
&= |a_{i1}||b_{1j}| + |a_{i2}||b_{2j}| + \cdots + |a_{in}||b_{nj}|
  && \text{by the multiplicative property of absolute value} \\
&\le \sqrt{|a_{i1}|^2 + \cdots + |a_{in}|^2}\,\sqrt{|b_{1j}|^2 + \cdots + |b_{nj}|^2}
  && \text{by the Cauchy-Schwarz inequality.}
\end{align*}

Now, summing the squares of both sides, we get

\begin{align*}
|AB|^2 &= \sum_{i,j} |(i,j)\text{-entry of } AB|^2 \\
&\le \sum_{i,j} \left(|a_{i1}|^2 + \cdots + |a_{in}|^2\right)\left(|b_{1j}|^2 + \cdots + |b_{nj}|^2\right) \\
&= \sum_{i} \left(|a_{i1}|^2 + \cdots + |a_{in}|^2\right) \sum_{j} \left(|b_{1j}|^2 + \cdots + |b_{nj}|^2\right) \\
&= |A|^2|B|^2, \quad \text{as required.} \qquad \square
\end{align*}
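The inequality can also be spot-checked numerically. The following is a minimal sketch in plain Python and is no substitute for the proof: the helper names `mat_abs`, `mat_mul`, and `random_matrix` are illustrative choices, and matrices are stored as lists of rows.

```python
import math
import random

def mat_abs(A):
    """Matrix absolute value: square root of the sum of squared entries."""
    return math.sqrt(sum(abs(x) ** 2 for row in A for x in row))

def mat_mul(A, B):
    """Product of two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def random_matrix(n, rng):
    """An n x n matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)  # fixed seed so the experiment is repeatable
worst_ratio = 0.0
for _ in range(1000):
    A, B = random_matrix(3, rng), random_matrix(3, rng)
    ratio = mat_abs(mat_mul(A, B)) / (mat_abs(A) * mat_abs(B))
    worst_ratio = max(worst_ratio, ratio)

# The submultiplicative property predicts a ratio of at most 1.
print("largest |AB| / (|A||B|) observed:", worst_ratio)
```

A random search of this kind can only fail to find a counterexample; it is the Cauchy-Schwarz argument above that rules one out.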


It follows from the submultiplicative property that $|A^m| \le |A|^m$. Along
with the triangle inequality $|A + B| \le |A| + |B|$, the submultiplicative
property enables us to test convergence of matrix infinite series by comparing
them with series of real numbers. In particular, we have:

Convergence of the exponential series. If $A$ is any $n \times n$ real matrix, then

\[ \mathbf{1} + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots, \quad \text{where } \mathbf{1} = n \times n \text{ identity matrix}, \]

is convergent in $\mathbb{R}^{n^2}$.

Proof. It suffices to prove that this series is absolutely convergent, that is,
to prove the convergence of

\[ |\mathbf{1}| + \frac{|A|}{1!} + \frac{|A^2|}{2!} + \frac{|A^3|}{3!} + \cdots. \]

This is a series of positive real numbers, whose terms (except for the first)
are less than or equal to the corresponding terms of

\[ 1 + \frac{|A|}{1!} + \frac{|A|^2}{2!} + \frac{|A|^3}{3!} + \cdots \]

by the submultiplicative property. The latter series is the series for the real
exponential function $e^{|A|}$; hence the original series is convergent. □

Thus it is meaningful to make the following definition, valid for real,
complex, or quaternion matrices.

Definition. The exponential of any $n \times n$ matrix $A$ is given by the series

\[ e^A = \mathbf{1} + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots. \]

The matrix exponential function is a generalization of the complex and
quaternion exponential functions. We already know that each complex
number $z = a + bi$ can be represented by the $2 \times 2$ real matrix

\[ Z = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \]

and it is easy to check that $e^z$ is represented by $e^Z$. We defined the
quaternion $q = a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ to be the $2 \times 2$ complex matrix

\[ Q = \begin{pmatrix} a + di & -b + ci \\ b + ci & a - di \end{pmatrix}, \]

so the exponential of a quaternion matrix may be represented by the
exponential of a complex matrix.

From now on we will often denote the exponential function simply by 
exp, regardless of the type of objects being exponentiated. 
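As a concrete illustration of the definition, the series can be summed numerically and compared with the complex exponential. The sketch below is an assumption-laden illustration, not a recommended algorithm: `mat_exp` simply adds the first 40 terms of the series (adequate only for small matrices), and the names `mat_mul`, `mat_exp`, and `max_error` are made up for this example.

```python
import cmath

def mat_mul(A, B):
    """Product of two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=40):
    """Partial sum 1 + A/1! + ... + A^(terms-1)/(terms-1)! of the series."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]      # holds A^m / m!, starting with m = 0
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

# The complex number z = a + bi and the real matrix Z that represents it.
a, b = 0.3, 0.7
Z = [[a, -b], [b, a]]

E = mat_exp(Z)
w = cmath.exp(complex(a, b))               # e^z computed as a complex number
expected = [[w.real, -w.imag], [w.imag, w.real]]

max_error = max(abs(E[i][j] - expected[i][j])
                for i in range(2) for j in range(2))
print("largest entrywise difference:", max_error)
```

The difference should be at the level of floating-point rounding, which is consistent with the claim that $e^z$ is represented by $e^Z$.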

Exercises 

The version of the Cauchy-Schwarz inequality used to prove the submultiplicative
property is the real inner product inequality $|u \cdot v| \le |u||v|$, where

\[ u = (|a_{i1}|, |a_{i2}|, \ldots, |a_{in}|) \quad \text{and} \quad v = (|b_{1j}|, |b_{2j}|, \ldots, |b_{nj}|). \]

It is probably a good idea for me to review this form of Cauchy-Schwarz, since
some readers may not have seen it.

The proof depends on the fact that $w \cdot w = |w|^2 \ge 0$ for any real vector $w$.

4.5.1 Show that $0 \le (u + xv) \cdot (u + xv) = |u|^2 + 2(u \cdot v)x + x^2|v|^2 = q(x)$, for any
real vectors $u$, $v$ and real number $x$.

4.5.2 Use the positivity of the quadratic function $q(x)$ found in Exercise 4.5.1 to
deduce that

\[ (2u \cdot v)^2 - 4|u|^2|v|^2 \le 0, \]

that is, $|u \cdot v| \le |u||v|$.


Matrix exponentiation gives another proof that $e^{i\theta} = \cos\theta + i\sin\theta$, since we
can interpret $i\theta$ as a $2 \times 2$ real matrix $A$.

4.5.3 Show, directly from the definition of matrix exponentiation, that

\[ A = \begin{pmatrix} 0 & -\theta \\ \theta & 0 \end{pmatrix} \;\Rightarrow\; e^A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]

The exponential of an arbitrary matrix is hard to compute in general, but easy
when the matrix is diagonal, or diagonalizable.

4.5.4 Suppose that $D$ is a diagonal matrix with diagonal entries $\lambda_1, \lambda_2, \ldots, \lambda_k$. By
computing the powers $D^n$ show that $e^D$ is a diagonal matrix with diagonal
entries $e^{\lambda_1}, e^{\lambda_2}, \ldots, e^{\lambda_k}$.

4.5.5 If $A$ is a matrix of the form $BCB^{-1}$, show that $e^A = Be^CB^{-1}$.

4.5.6 By term-by-term differentiation, or otherwise, show that $\frac{d}{dt}e^{tA} = Ae^{tA}$ for
any square matrix $A$.


4.6 The affine group of the line 

Transformations of $\mathbb{R}$ of the form

\[ f_{a,b}(x) = ax + b, \quad \text{where } a, b \in \mathbb{R} \text{ and } a > 0, \]

are called affine transformations. They form a group because the product of
any two such transformations is another of the same form, and the inverse
of any such transformation is another of the same form. We call this group
Aff(1), and we can view it as a matrix group. The function $f_{a,b}$ corresponds
to the matrix

\[ F_{a,b} = \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}, \]

applied on the left to $\begin{pmatrix} x \\ 1 \end{pmatrix}$, because

\[ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ 1 \end{pmatrix} = \begin{pmatrix} ax + b \\ 1 \end{pmatrix}. \]

Thus Aff(1) can be viewed as a group of $2 \times 2$ real matrices, and hence
it is a geometric object in $\mathbb{R}^4$. On the other hand, Aff(1) is intrinsically
two-dimensional, because its elements form a half-plane. To see why,
consider first the two-dimensional subspace of $\mathbb{R}^4$ consisting of the points
$(a, b, 0, 0)$. This is a plane, and hence so is the set of points $(a, b, 0, 1)$
obtained by translating it by distance 1 in the direction of the fourth
coordinate. Finally, we get half of this plane by restricting the first coordinate
to $a > 0$.

Aff(1) is closed under nonsingular limits; hence it is a two-dimensional
matrix Lie group, like the real vector space $\mathbb{R}^2$ under vector addition, and
the torus $\mathbb{S}^1 \times \mathbb{S}^1$. Unlike these two matrix Lie groups, however, Aff(1) is
not abelian. For example,

\[ f_{2,1}f_{1,2}(x) = 1(2x + 1) + 2 = 2x + 3, \]

whereas

\[ f_{1,2}f_{2,1}(x) = 2(1x + 2) + 1 = 2x + 5. \]
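The two composites can be checked mechanically, both as functions and as matrices. This is a small sketch with made-up helper names (`affine`, `F`, `mat_mul`); it makes no claim about which order of composition the product notation above denotes, and simply computes both orders.

```python
def affine(a, b):
    """The map f_{a,b}: x -> a*x + b."""
    return lambda x: a * x + b

def F(a, b):
    """The 2 x 2 matrix F_{a,b} representing f_{a,b}."""
    return [[a, b], [0, 1]]

def mat_mul(A, B):
    """Product of two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

f21, f12 = affine(2, 1), affine(1, 2)

def first_f21_then_f12(x):
    return f12(f21(x))      # 1*(2x + 1) + 2 = 2x + 3

def first_f12_then_f21(x):
    return f21(f12(x))      # 2*(1x + 2) + 1 = 2x + 5

# The matrix products in the two possible orders.
product_a = mat_mul(F(2, 1), F(1, 2))
product_b = mat_mul(F(1, 2), F(2, 1))
print(product_a, product_b)
```

With matrices acting on the left of a column vector, the right-hand factor of a product is the transformation applied first, so `product_a` encodes $2x + 5$ and `product_b` encodes $2x + 3$. Either way the two orders disagree, which is all that nonabelian requires.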


Aff(1) is in fact the only connected, nonabelian two-dimensional Lie group.
This makes it interesting, yet still amenable to computation. As we will
see, it is easy to compute its tangent vectors, and to exponentiate them,
from first principles. But first note that there are two ways in which Aff(1)
differs from the Lie groups studied in previous chapters.

• As a geometric object, Aff(1) is an unbounded subset of $\mathbb{R}^4$ (because
$b$ can be arbitrary and $a$ is an arbitrary positive number). We say that
it is a noncompact Lie group, whereas SO(2), SO(3), and SU(2)
are compact. In Chapter 8 we give a more precise discussion of
compactness.

• As a group, it admits an $\infty$-to-1 homomorphism onto another infinite
group. The homomorphism $\varphi$ in question is

\[ \varphi : \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \mapsto \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}. \]

This sends the infinitely many matrices $F_{a,b}$, as $b$ varies, to the matrix
$F_{a,0}$, and it is easily checked that

\[ \varphi(F_{a_1,b_1}F_{a_2,b_2}) = \varphi(F_{a_1,b_1})\,\varphi(F_{a_2,b_2}). \]


It follows, in particular, that Aff(1) is not a simple group. Also, the normal
subgroup of matrices in the kernel of $\varphi$ is itself a matrix Lie group.
The kernel consists of all the matrices that $\varphi$ sends to the identity matrix,
namely, the group of matrices of the form

\[ \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \quad \text{for } b \in \mathbb{R}. \]

Geometrically, this subgroup is a line, and the group operation corresponds
to addition on the line, because

\[ \begin{pmatrix} 1 & b_1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & b_2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & b_1 + b_2 \\ 0 & 1 \end{pmatrix}. \]


The Lie algebra of Aff(1)

Since Aff(1) is half of a plane in the space $\mathbb{R}^4$ of $2 \times 2$ matrices, it is
geometrically clear that its tangent space at the identity element is a plane.
However, to find explicit matrices for the elements of the tangent space
we look at the vectors from the identity element $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ of Aff(1) to nearby
points of Aff(1).

These are the vectors

\[ \begin{pmatrix} 1 + \alpha & \beta \\ 0 & 1 \end{pmatrix} - \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ 0 & 0 \end{pmatrix} \]

for small values of $\alpha$ and $\beta$. Normally, one needs to find the limiting
directions of these vectors (the “tangent vectors”) as $\alpha, \beta \to 0$, but in this
case all such directions lie in the plane spanned by the vectors

\[ J = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad K = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}. \]

The Lie bracket $[u, v] = uv - vu$ on this two-dimensional space is determined
by the Lie bracket of the basis vectors:

\[ [J, K] = K. \]


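The exponential function maps the tangent space 1-to-1 onto Aff(1), as
one sees from some easy calculations with a general matrix $\begin{pmatrix} \alpha & \beta \\ 0 & 0 \end{pmatrix}$ in the
tangent space. First, induction shows that

\[ \begin{pmatrix} \alpha & \beta \\ 0 & 0 \end{pmatrix}^n = \begin{pmatrix} \alpha^n & \beta\alpha^{n-1} \\ 0 & 0 \end{pmatrix}, \]

or, in terms of $J$ and $K$,

\[ (\alpha J + \beta K)^n = \alpha^n J + \beta\alpha^{n-1} K. \]

Then substituting these powers in the exponential series (and writing $\mathbf{1}$ for
the identity matrix) gives

\begin{align*}
e^{\alpha J + \beta K}
&= \mathbf{1} + \frac{1}{1!}(\alpha J + \beta K) + \frac{1}{2!}(\alpha J + \beta K)^2 + \cdots + \frac{1}{n!}(\alpha J + \beta K)^n + \cdots \\
&= \mathbf{1} + \frac{1}{1!}(\alpha J + \beta K) + \frac{1}{2!}(\alpha^2 J + \beta\alpha K) + \cdots + \frac{1}{n!}(\alpha^n J + \beta\alpha^{n-1} K) + \cdots \\
&= \mathbf{1} + \left(\frac{\alpha}{1!} + \frac{\alpha^2}{2!} + \cdots + \frac{\alpha^n}{n!} + \cdots\right) J
   + \beta\left(\frac{1}{1!} + \frac{\alpha}{2!} + \cdots + \frac{\alpha^{n-1}}{n!} + \cdots\right) K \\
&= \begin{cases}
\begin{pmatrix} e^{\alpha} & \frac{\beta}{\alpha}(e^{\alpha} - 1) \\ 0 & 1 \end{pmatrix} & \text{if } \alpha \ne 0, \\[2ex]
\begin{pmatrix} 1 & \beta \\ 0 & 1 \end{pmatrix} & \text{if } \alpha = 0.
\end{cases}
\end{align*}

The resulting matrix equals a given $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$, where $a > 0$, for a unique choice of $\alpha$
and $\beta$. First choose $\alpha$ so that $a = e^{\alpha}$; then choose $\beta$ so that

\[ b = \frac{\beta}{\alpha}(e^{\alpha} - 1), \quad \text{or } b = \beta \text{ if } \alpha = 0. \]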
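The closed form just derived, and the recipe for recovering $\alpha$ and $\beta$ from $a$ and $b$, can be compared with a direct summation of the series. The following is a minimal sketch under stated assumptions: `mat_exp` sums the first 40 terms of the exponential series, and `exp_closed_form` and `log_aff` are hypothetical names for the two formulas in the text.

```python
import math

def mat_mul(A, B):
    """Product of two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=40):
    """Partial sum of the exponential series 1 + A/1! + A^2/2! + ..."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def exp_closed_form(alpha, beta):
    """The matrix e^(alpha*J + beta*K) according to the formula in the text."""
    if alpha == 0:
        return [[1.0, beta], [0.0, 1.0]]
    return [[math.exp(alpha), (beta / alpha) * (math.exp(alpha) - 1.0)],
            [0.0, 1.0]]

def log_aff(a, b):
    """Recover (alpha, beta) from the entries a > 0 and b of a matrix in Aff(1)."""
    alpha = math.log(a)
    beta = b if alpha == 0 else b * alpha / (a - 1.0)
    return alpha, beta

def max_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

pairs = [(0.5, 2.0), (-1.2, 0.3), (0.0, 3.0)]
errors = [max_diff(mat_exp([[al, be], [0.0, 0.0]]), exp_closed_form(al, be))
          for al, be in pairs]

# Round trip: start from a = 2, b = 5, take the "logarithm", exponentiate again.
alpha, beta = log_aff(2.0, 5.0)
round_trip = exp_closed_form(alpha, beta)
print(errors, round_trip)
```

The round trip illustrates the uniqueness claim for one matrix only; the general statement rests on the fact that $a = e^{\alpha}$ determines $\alpha$, after which $b$ determines $\beta$.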


Exercises 


Exponentiation of matrices does not have all the properties of ordinary
exponentiation, because matrices do not generally commute. However, exponentiation
works normally on matrices that do commute, such as powers of a fixed matrix.
Here is an example in Aff(1).

4.6.1 Work out $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}^2$ and $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}^3$, and then prove by induction that

\[ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}^n = \begin{pmatrix} a^n & b\,\dfrac{a^n - 1}{a - 1} \\ 0 & 1 \end{pmatrix} \quad (a \ne 1). \]

4.6.2 Use the formula in Exercise 4.6.1 to work out the $n$th power of the matrix
$e^{\alpha J + \beta K}$, and compare it with the matrix $e^{n\alpha J + n\beta K}$ obtained by
exponentiating $n\alpha J + n\beta K$.

4.6.3 Show that the matrices $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}^n$, for $n = 1, 2, 3, \ldots$, lie on a line in $\mathbb{R}^4$.
Also show that the line passes through the point $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.


4.7 Discussion


The first to extend the exponential function to noncommuting objects was
Hamilton, who applied it to quaternions almost as soon as he discovered
them in 1843. In the paper Hamilton [1967], a writeup of an address to the
Royal Irish Academy on November 13, 1843, he defines the exponential
function for a quaternion $q$ on p. 207,

\[ e^q = 1 + \frac{q}{1!} + \frac{q^2}{2!} + \frac{q^3}{3!} + \cdots, \]

and observes immediately that

\[ e^q e^{q'} = e^{q + q'} \quad \text{when } qq' = q'q. \]

On p. 225 he evaluates the exponential of a pure imaginary quaternion,
stating essentially the result of Section 4.2, that

\[ e^{\theta u} = \cos\theta + u\sin\theta \quad \text{when } |u| = 1. \]

The exponential map was extended to Lie groups in general by Lie 
in 1888. From his point of view, exponentiation sends “infinitesimal” el- 
ements of a continuous group to “finite” elements (see Hawkins [2000], 
p. 82). A few mathematicians in the late nineteenth century briefly noted 
that exponentiation makes sense for matrices, but the theory of matrix ex- 
ponentiation did not flourish until Wedderburn [1925] proved the submulti- 
plicative property of the matrix absolute value that guarantees convergence 
of the exponential series for matrices. The trailblazing investigation of von 
Neumann [1929] takes Wedderburn’s result as its starting point. 

The matrix exponential function has many properties in common with
the ordinary exponential, such as

\[ e^X = \lim_{n \to \infty} \left(\mathbf{1} + \frac{X}{n}\right)^n. \]

We do not need this property in this book, but it nicely illustrates the idea
of Lie (and, before him, Jordan [1869]), that the “finite” elements of a
continuous group may be “generated” by its “infinitesimal” elements. If $X$
is a tangent vector at $\mathbf{1}$ to a group $G$ and $n$ is “infinitely large,” then $\mathbf{1} + \frac{X}{n}$
is an “infinitesimal” element of $G$. By iterating this element $n$ times we
obtain the “finite” element $e^X$ of $G$.
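The limit can be watched numerically. The sketch below is an illustration only: it takes $X$ to be the generator of rotations in SO(2), uses the rotation matrix as the known value of $e^X$ (Exercise 4.5.3), and computes the $n$th power for $n = 2^k$ by squaring $k$ times. The function name `limit_approximation` is a made-up label.

```python
import math

def mat_mul(A, B):
    """Product of two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def limit_approximation(X, k):
    """Return (1 + X/n)^n for n = 2**k, computed by squaring k times."""
    n = 2 ** k
    size = len(X)
    M = [[(1.0 if i == j else 0.0) + X[i][j] / n for j in range(size)]
         for i in range(size)]
    for _ in range(k):
        M = mat_mul(M, M)
    return M

theta = 1.0
X = [[0.0, -theta], [theta, 0.0]]
exact = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta), math.cos(theta)]]

errors = []
for k in (4, 10, 20):
    approx = limit_approximation(X, k)
    errors.append(max(abs(approx[i][j] - exact[i][j])
                      for i in range(2) for j in range(2)))
print(errors)
```

The errors shrink roughly in proportion to $1/n$, so this is a slow way to compute $e^X$; its interest lies in showing a "finite" rotation built from many near-identity steps.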
It was discovered by Lie’s colleague Engel in 1890 that, in the group
SL(2,C) of $2 \times 2$ complex matrices with determinant 1, not every element
is an exponential. In particular, the matrix $\begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$ is not the
exponential of any matrix tangent to SL(2,C) at $\mathbf{1}$; hence it is not “generated by
an infinitesimal element” of SL(2,C). (We indicate a proof in the
exercises to Section 5.6.) The result was considered paradoxical at the time
(see Hawkins [2000], p. 86), and its mystery was dispelled only when the
global properties of Lie groups became better understood. In the 1920s it
was realized that the topology of a Lie group is the key to its global
behavior. For example, the paradoxical behavior of SL(2,C) can be attributed
to its noncompactness, because it can be shown that every element of a
connected, compact Lie group is the exponential of a tangent vector. We
do not prove this theorem about exponentiation in this book, but we will
discuss compactness and connectedness further in Chapter 8.

For a noncompact, but connected, group $G$ the next best thing to
surjectivity of exp is the following: every $g \in G$ is the product $e^{X_1} e^{X_2} \cdots e^{X_k}$ of
exponentials of finitely many tangent vectors $X_1, X_2, \ldots, X_k$. This result is
due to von Neumann [1929], and we give a proof in Section 8.6.

For readers acquainted with differential geometry, it should be mentioned
that the exponential function can be generalized even beyond matrix
groups, to Riemannian manifolds. In this setting, the exponential function
maps the tangent space $T_P(M)$ at point $P$ on a Riemannian manifold $M$
into $M$ by mapping lines through $O$ in $T_P(M)$ isometrically onto geodesics
of $M$ through $P$. The Riemannian manifolds $\mathbb{S}^1 = \{z \in \mathbb{C} : |z| = 1\}$ and
$\mathbb{S}^3 = \{q \in \mathbb{H} : |q| = 1\}$, and their tangent spaces $\mathbb{R}$ and $\mathbb{R}^3$, nicely illustrate
the geodesic aspect of exponentiation. The exponential map sends straight
lines through $O$ in the tangent space isometrically to geodesic circles in
the manifolds (to $\mathbb{S}^1$ itself in $\mathbb{C}$, and to the unit circles $\cos\theta + u\sin\theta$ in $\mathbb{H}$,
which are geodesic because they are the largest possible circles in $\mathbb{S}^3$).


5 The tangent space


Preview 

The miracle of Lie theory is that a curved object, a Lie group $G$, can be
almost completely captured by a flat one, the tangent space $T_{\mathbf{1}}(G)$ of $G$ at
the identity. The tangent space of $G$ at the identity consists of the tangent
vectors to smooth paths in $G$ where they pass through $\mathbf{1}$. A path $A(t)$ in $G$
is called smooth if its derivative $A'(t)$ exists, and if $A(0) = \mathbf{1}$ we call $A'(0)$
the tangent or velocity vector of $A(t)$ at $\mathbf{1}$. $T_{\mathbf{1}}(G)$ consists of the velocity
vectors of all smooth paths through $\mathbf{1}$.

It is quite easy to determine the form of the matrix $A'(0)$ for a smooth
path $A(t)$ through $\mathbf{1}$ in any of the classical groups, that is, the generalized
rotation groups of Chapter 3 and the general and special linear groups,
GL(n,C) and SL(n,C), we will meet in Section 5.6. For example, any
tangent vector of SO(n) at $\mathbf{1}$ is an $n \times n$ real skew-symmetric matrix, that is, a
matrix $X$ such that $X + X^T = 0$. The problem is to find smooth paths in the
first place. It is here that the exponential function comes to our rescue.

As we saw in Section 4.5, $e^X$ is defined for any $n \times n$ matrix $X$ by the
infinite series used to define $e^x$ for any real or complex number $x$. This
matrix exponential function provides a smooth path with prescribed tangent
vector at $\mathbf{1}$, namely the path $A(t) = e^{tX}$, for which $A'(0) = X$. In particular,
it turns out that if $X$ is skew-symmetric then $e^{tX} \in$ SO(n) for any real $t$, so
the potential tangent vectors to SO(n) are the actual tangent vectors.

In this way we find that $T_{\mathbf{1}}(\mathrm{SO}(n)) = \{X \in M_n(\mathbb{R}) : X + X^T = 0\}$, where
$M_n(\mathbb{R})$ is the space of $n \times n$ real matrices. The exponential function
similarly enables us to find the tangent spaces of all the classical groups: O(n),
SO(n), U(n), SU(n), Sp(n), GL(n,C), and SL(n,C).


J. Stillwell, Naive Lie Theory , DOI: 10.1007/978-0-387-78214-0_5. 
© Springer Science+Business Media, LLC 2008

5.1 Tangent vectors of O(n), U(n), Sp(n)

In a space $S$ of matrices, a path is a continuous function $t \mapsto A(t) \in S$, where
$t$ belongs to some interval of real numbers, so the entries $a_{ij}(t)$ of $A(t)$ are
continuous functions of the real variable $t$. The path is called smooth, or
differentiable, if the functions $a_{ij}(t)$ are differentiable.

For example, the function

\[ B(t) = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix} \]

is a smooth path in SO(2), while the function

\[ C(t) = \begin{pmatrix} \cos|t| & -\sin|t| \\ \sin|t| & \cos|t| \end{pmatrix} \]

is a path in SO(2) that is not smooth at $t = 0$.

The derivative $A'(t)$ of a smooth $A(t)$ is defined in the usual way as

\[ \lim_{\Delta t \to 0} \frac{A(t + \Delta t) - A(t)}{\Delta t}, \]

and one sees immediately that $A'(t)$ is simply the matrix with entries $a'_{ij}(t)$,
where $a_{ij}(t)$ are the entries of $A(t)$. Tangent vectors at $\mathbf{1}$ of a group $G$ of
matrices are matrices $X$ of the form

\[ X = A'(0), \]

where $A(t)$ is a smooth path in $G$ with $A(0) = \mathbf{1}$ (that is, a path “passing
through $\mathbf{1}$ at time 0”). Tangent vectors can thus be viewed as “velocity
vectors” of points moving smoothly through the point $\mathbf{1}$, as in Section 4.3.
For example, in SO(2),

\[ A(t) = \begin{pmatrix} \cos\theta t & -\sin\theta t \\ \sin\theta t & \cos\theta t \end{pmatrix} \]

is a smooth path through $\mathbf{1}$ because $A(0) = \mathbf{1}$. And since

\[ A'(t) = \begin{pmatrix} -\theta\sin\theta t & -\theta\cos\theta t \\ \theta\cos\theta t & -\theta\sin\theta t \end{pmatrix}, \]

the corresponding tangent vector is

\[ A'(0) = \begin{pmatrix} 0 & -\theta \\ \theta & 0 \end{pmatrix}. \]

In fact, all tangent vectors are of this form, so they form the 1-dimensional
vector space of real multiples of the matrix $\mathbf{i} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. This confirms what
we already know geometrically: SO(2) is a circle and its tangent space at
the identity is a line.
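The "velocity vector" description can be made concrete by estimating the derivative of the rotation path at time 0 with a difference quotient. This is only an illustrative sketch: the central difference and the step size `h` are choices made here, and the function names are hypothetical.

```python
import math

def rotation_path(t, theta):
    """The path A(t) in SO(2) that rotates through angle theta*t at time t."""
    c, s = math.cos(theta * t), math.sin(theta * t)
    return [[c, -s], [s, c]]

def velocity_at_identity(path, h=1e-6):
    """Central-difference estimate of path'(0), computed entry by entry."""
    forward, backward = path(h), path(-h)
    return [[(f - b) / (2 * h) for f, b in zip(fr, br)]
            for fr, br in zip(forward, backward)]

theta = 0.8
X = velocity_at_identity(lambda t: rotation_path(t, theta))
print(X)   # should be close to [[0, -theta], [theta, 0]]
```

The estimate is a real multiple of the matrix $\mathbf{i}$ up to rounding, in line with the statement that the tangent space of SO(2) at the identity is a line.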

We now find the form of tangent vectors for all the groups O(n), U(n),
Sp(n) by differentiating the defining equation $A\overline{A}^T = \mathbf{1}$ of their members
$A$. (In the case of O(n), $A$ is real, so $\overline{A} = A$. In the cases of U(n) and Sp(n),
$\overline{A}$ is the complex and quaternion conjugate, respectively.)

Tangent vectors of O(n), U(n), Sp(n). The tangent vectors $X$ at $\mathbf{1}$ are
matrices of the following forms (where 0 denotes the zero matrix):

(a) For O(n), $n \times n$ real matrices $X$ such that $X + X^T = 0$.

(b) For U(n), $n \times n$ complex matrices $X$ such that $X + \overline{X}^T = 0$.

(c) For Sp(n), $n \times n$ quaternion matrices $X$ such that $X + \overline{X}^T = 0$.

Proof. (a) The matrices $A \in$ O(n) satisfy $AA^T = \mathbf{1}$. Let $A = A(t)$ be a
smooth path originating at $\mathbf{1}$, and take $d/dt$ of the equation

\[ A(t)A(t)^T = \mathbf{1}. \]

The product rule holds as for ordinary functions, as does $\frac{d}{dt}\mathbf{1} = 0$ because
$\mathbf{1}$ is a constant. Also, $\frac{d}{dt}(A^T) = \left(\frac{d}{dt}A\right)^T$ by considering matrix entries. So
we have

\[ A'(t)A(t)^T + A(t)A'(t)^T = 0. \]

Since $A(0) = \mathbf{1} = A(0)^T$, for $t = 0$ this equation becomes

\[ A'(0) + A'(0)^T = 0. \]

Thus any tangent vector $X = A'(0)$ satisfies $X + X^T = 0$.

(b) The matrices $A \in$ U(n) satisfy $A\overline{A}^T = \mathbf{1}$. Again let $A = A(t)$ be a
smooth path with $A(0) = \mathbf{1}$ and now take $d/dt$ of the equation $A\overline{A}^T = \mathbf{1}$. By
considering matrix entries we see that $\frac{d}{dt}\overline{A(t)} = \overline{A'(t)}$. Then an argument
like that in (a) shows that any tangent vector $X$ satisfies $X + \overline{X}^T = 0$.

(c) For the matrices $A \in$ Sp(n) we similarly find that the tangent vectors
$X$ satisfy $X + \overline{X}^T = 0$. □
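For part (b), the step described as "an argument like that in (a)" can be written out as follows. This is a sketch of the omitted computation, using only the product rule and the two entrywise facts quoted in the proof; the bar denotes complex conjugation.

```latex
\begin{align*}
0 = \frac{d}{dt}\,\mathbf{1}
  &= \frac{d}{dt}\Bigl(A(t)\,\overline{A(t)}^{\,T}\Bigr)
   = A'(t)\,\overline{A(t)}^{\,T} + A(t)\,\overline{A'(t)}^{\,T}, \\
\text{and at } t = 0:\quad
0 &= A'(0)\,\mathbf{1} + \mathbf{1}\,\overline{A'(0)}^{\,T}
   = X + \overline{X}^{\,T}.
\end{align*}
```

The same lines apply in case (c) with the quaternion conjugate in place of the complex one, provided the order of the factors is kept as written, since quaternion entries need not commute.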
The matrices $X$ satisfying $X + X^T = 0$ are called skew-symmetric,
because the reflection of each entry in the diagonal is its negative. That is,
$x_{ji} = -x_{ij}$. In particular, all the diagonal elements of a skew-symmetric
matrix are 0. Matrices $X$ satisfying $X + \overline{X}^T = 0$ are called skew-Hermitian.
Their entries satisfy $x_{ji} = -\overline{x_{ij}}$ and their diagonal elements are pure
imaginary.

It turns out that all skew-symmetric $n \times n$ real matrices are tangent,
not only to O(n), but also to SO(n) at $\mathbf{1}$. To prove this we use the matrix
exponential function from Section 4.5, showing that $e^X \in$ SO(n) for any
skew-symmetric $X$, in which case $X$ is tangent to the smooth path $e^{tX}$ in
SO(n).


Exercises 

To appreciate why smooth paths are better than mere paths, consider the following
example.

5.1.1 Interpret the paths $B(t)$ and $C(t)$ above as paths on the unit circle, say for
$-\pi/2 < t < \pi/2$.

5.1.2 If $B(t)$ or $C(t)$ is interpreted as the position of a point at time $t$, how does
the motion described by $B(t)$ differ from the motion described by $C(t)$?


5.2 The tangent space of SO(n)


In this section we return to the addition formula of the exponential function

\[ e^{A+B} = e^A e^B \quad \text{when } AB = BA, \]

which was previously set as a series of exercises in Section 4.1. This
formula can be proved by observing the nature of the calculation involved,
without actually doing any calculation. The argument goes as follows.

According to the definition of the exponential function, we want to
prove that

\[ \left(\mathbf{1} + \frac{A+B}{1!} + \cdots + \frac{(A+B)^n}{n!} + \cdots\right)
 = \left(\mathbf{1} + \frac{A}{1!} + \cdots + \frac{A^n}{n!} + \cdots\right)
   \left(\mathbf{1} + \frac{B}{1!} + \cdots + \frac{B^n}{n!} + \cdots\right). \]

This could be done by expanding both sides and showing that the
coefficient of $A^l B^m$ is the same on both sides. But if $AB = BA$ the calculation
involved is the same as the calculation for real numbers $A$ and $B$, in which
case we know that $e^{A+B} = e^A e^B$ by elementary calculus. Therefore, the
formula is correct for any commuting variables $A$ and $B$.

Now, the beauty of the matrices $X$ and $X^T$ appearing in the condition
$X + X^T = 0$ is that they commute! This is because, under this condition,

\[ XX^T = X(-X) = (-X)X = X^TX. \]

Thus it follows from the above property of the exponential function that

\[ e^X e^{X^T} = e^{X + X^T} = e^0 = \mathbf{1}. \]

But also, $e^{X^T} = (e^X)^T$ because $(X^T)^m = (X^m)^T$ and hence all terms in the
exponential series get transposed. Therefore

\[ \mathbf{1} = e^X e^{X^T} = e^X (e^X)^T. \]

In other words, if $X + X^T = 0$ then $e^X$ is an orthogonal matrix.

Moreover, $e^X$ has determinant 1, as can be seen by considering the path
of matrices $tX$ for $0 \le t \le 1$. For $t = 0$, we have $tX = 0$, so

\[ e^{tX} = e^0 = \mathbf{1}, \quad \text{which has determinant 1.} \]

And, as $t$ varies from 0 to 1, $e^{tX}$ varies continuously from $\mathbf{1}$ to $e^X$. This
implies that the continuous function $\det(e^{tX})$ remains constant, because
$\det = \pm 1$ for orthogonal matrices, and a continuous function cannot take
two (and only two) values. Thus we necessarily have $\det(e^X) = 1$, and
therefore if $X$ is an $n \times n$ real matrix with $X + X^T = 0$ then $e^X \in$ SO(n).
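The conclusion can be checked numerically for a particular $3 \times 3$ skew-symmetric matrix. The sketch below is illustrative only: the entries of `X` are arbitrary, `mat_exp` sums the first 40 terms of the exponential series, and `det3` is a hand-written cofactor expansion.

```python
def mat_mul(A, B):
    """Product of two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_exp(A, terms=40):
    """Partial sum of the exponential series 1 + A/1! + A^2/2! + ..."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def det3(M):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# An arbitrary skew-symmetric matrix: X + X^T = 0.
x, y, z = 0.4, -1.1, 0.7
X = [[0.0, -x, -y],
     [x, 0.0, -z],
     [y, z, 0.0]]

E = mat_exp(X)
P = mat_mul(E, transpose(E))                   # should be the identity
orth_error = max(abs(P[i][j] - (1.0 if i == j else 0.0))
                 for i in range(3) for j in range(3))
det_E = det3(E)
print(orth_error, det_E)
```

Both checks should agree with the argument above up to rounding: the product of $e^X$ with its transpose is the identity, and the determinant is 1 rather than $-1$.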

This allows us to complete our search for all the tangent vectors to
SO(n) at $\mathbf{1}$.

Tangent space of SO(n). The tangent space of SO(n) consists of precisely
the $n \times n$ real vectors $X$ such that $X + X^T = 0$.

Proof. In the previous section we showed that all tangent vectors $X$ to
SO(n) at $\mathbf{1}$ satisfy $X + X^T = 0$. Conversely, we have just seen that, for any
vector $X$ with $X + X^T = 0$, the matrix $e^X$ is in SO(n).

Now notice that $X$ is the tangent vector at $\mathbf{1}$ for the path $A(t) = e^{tX}$ in
SO(n). This holds because

\[ \frac{d}{dt} e^{tX} = X e^{tX}, \]

as in ordinary calculus. (This can be checked by differentiating the series
for $e^{tX}$.) It follows that $A(t)$ has the tangent vector $A'(0) = X$ at $\mathbf{1}$, and
therefore each $X$ such that $X + X^T = 0$ occurs as a tangent vector to SO(n)
at $\mathbf{1}$, as required. □

As mentioned in the previous section, a matrix $X$ such that $X + X^T = 0$
is called skew-symmetric. Important examples are the $3 \times 3$
skew-symmetric matrices, which have the form

\[ X = \begin{pmatrix} 0 & -x & -y \\ x & 0 & -z \\ y & z & 0 \end{pmatrix}. \]

Notice that sums and scalar multiples of these skew-symmetric matrices
are again skew-symmetric, so the $3 \times 3$ skew-symmetric matrices form a
vector space. This space has dimension 3, as we would expect, since it is
the tangent space to the 3-dimensional space SO(3). Less obviously, the
skew-symmetric matrices are closed under the Lie bracket operation

\[ [X_1, X_2] = X_1X_2 - X_2X_1. \]

Later we will see that the tangent space of any Lie group $G$ is a vector space
closed under the Lie bracket, and that the Lie bracket reflects the conjugate
$g_1 g_2 g_1^{-1}$ of $g_2$ by $g_1^{-1} \in G$. This is why the tangent space is so important
in the investigation of Lie groups: it “linearizes” them without obliterating
much of their structure.
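Closure under the Lie bracket is easy to test on examples before proving it. The following sketch uses integer entries so that all arithmetic is exact; the helpers `bracket`, `random_skew`, and `is_skew` are hypothetical names introduced for the illustration.

```python
import random

def mat_mul(A, B):
    """Product of two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(X, Y):
    """The Lie bracket [X, Y] = XY - YX."""
    XY, YX = mat_mul(X, Y), mat_mul(Y, X)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(XY, YX)]

def random_skew(n, rng):
    """A random n x n skew-symmetric matrix with small integer entries."""
    X = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            X[i][j] = rng.randint(-5, 5)
            X[j][i] = -X[i][j]
    return X

def is_skew(X):
    """True when X + X^T = 0."""
    n = len(X)
    return all(X[i][j] == -X[j][i] for i in range(n) for j in range(n))

rng = random.Random(1)  # fixed seed so the run is repeatable
all_closed = all(is_skew(bracket(random_skew(4, rng), random_skew(4, rng)))
                 for _ in range(200))
print(all_closed)
```

The test uses $4 \times 4$ matrices to suggest that nothing here is special to the $3 \times 3$ case; the general proof is a one-line computation with transposes.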

Exercises 

According to the theorem above, the tangent space of SO(3) consists of $3 \times 3$ real
matrices $X$ such that $X = -X^T$. The following exercises study this space and the
Lie bracket operation on it.

5.2.1 Explain why each element of the tangent space of SO(3) has the form

\[ X = \begin{pmatrix} 0 & -x & -y \\ x & 0 & -z \\ y & z & 0 \end{pmatrix} = xI + yJ + zK, \]

where

\[ I = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
   J = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad
   K = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}. \]

5.2.2 Deduce from Exercise 5.2.1 that the tangent space of SO(3) is a real vector
space of dimension 3.

5.2.3 Check that $[I, J] = K$, $[J, K] = I$, and $[K, I] = J$. (This shows, among other
things, that the $3 \times 3$ real skew-symmetric matrices are closed under the
Lie bracket operation.)

5.2.4 Deduce from Exercises 5.2.2 and 5.2.3 that the tangent space of SO(3)
under the Lie bracket is isomorphic to $\mathbb{R}^3$ under the cross product operation.

5.2.5 Prove directly that the $n \times n$ skew-symmetric matrices are closed under the
Lie bracket, using $X^T = -X$ and $Y^T = -Y$.

The argument above shows that exponentiation sends each skew-symmetric
$X$ to an orthogonal $e^X$, but it is not clear that each orthogonal matrix is obtainable
in this way. Here is an argument for the case $n = 3$.

5.2.6 Find the exponential of the matrix $B = \begin{pmatrix} 0 & -\theta & 0 \\ \theta & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$.

5.2.7 Show that $Ae^BA^T = e^{ABA^T}$ for any orthogonal matrix $A$.

5.2.8 Deduce from Exercises 5.2.6 and 5.2.7 that each matrix in SO(3) equals $e^X$
for some skew-symmetric $X$.


5.3 The tangent space of U(n), SU(n), Sp(n)


We know from Sections 3.3 and 3.4 that U(n) and Sp(n), respectively, are
the groups of $n \times n$ complex and quaternion matrices $A$ satisfying $A\overline{A}^T = \mathbf{1}$.
This equation enables us to find their tangent spaces by essentially the same
steps we used to find the tangent space of SO(n) in the last two sections.
The outcome is also the same, except that, instead of skew-symmetric
matrices, we get skew-Hermitian matrices. As we saw in Section 5.1, these
matrices $X$ satisfy $X + \overline{X}^T = 0$.

Tangent space of U(n) and Sp(n). The tangent space of U(n) consists of
all the $n \times n$ complex matrices satisfying $X + \overline{X}^T = 0$. The tangent space
of Sp(n) consists of all $n \times n$ quaternion matrices $X$ satisfying $X + \overline{X}^T = 0$,
where $\overline{X}$ denotes the quaternion conjugate of $X$.

Proof. From Section 5.1 we know that the tangent vectors at $\mathbf{1}$ to a space
of matrices satisfying $A\overline{A}^T = \mathbf{1}$ are matrices $X$ satisfying $X + \overline{X}^T = 0$.

Conversely, suppose that $X$ is any $n \times n$ complex (respectively,
quaternion) matrix such that $X + \overline{X}^T = 0$. It follows that

\[ \overline{X}^T = -X \]

and therefore

\[ X\overline{X}^T = X(-X) = (-X)X = \overline{X}^T X. \]

This implies, by the addition formula for the exponential function for
commuting matrices, that

\[ \mathbf{1} = e^0 = e^{X + \overline{X}^T} = e^X e^{\overline{X}^T}. \]

It is also clear from the definition of $e^X$ that $e^{\overline{X}^T} = \left(\overline{e^X}\right)^T$. So if $X$ is any
$n \times n$ complex (respectively, quaternion) matrix satisfying $X + \overline{X}^T = 0$ then
$e^X$ is in U(n) (respectively, Sp(n)). It follows in turn that any such $X$ is a
tangent vector at $\mathbf{1}$. Namely, $X = A'(0)$ for the smooth path $A(t) = e^{tX}$. □

In Section 5.1 we found the form of tangent vectors to O(n) at $\mathbf{1}$, but
in Section 5.2 we were able to show that all vectors of this form are in
fact tangent to SO(n), so we actually had the tangent space to SO(n) at $\mathbf{1}$.
An identical step from U(n) to SU(n) is not possible, because the tangent
space of U(n) at $\mathbf{1}$ is really a larger space than the tangent space to SU(n).
Vectors $X$ in the tangent space of SU(n) satisfy the additional condition that
Tr($X$), the trace of $X$, is zero. (Recall the definition from linear algebra:
the trace of a square matrix is the sum of its diagonal entries.)

To prove that $\mathrm{Tr}(X) = 0$ for any tangent vector $X$ to SU(n), we use the
following lemma about the determinant and the trace.

Determinant of exp. For any square complex matrix $A$,

\[ \det(e^A) = e^{\mathrm{Tr}(A)}. \]

Proof. We appeal to the theorem from linear algebra that for any complex
matrix $A$ there is an invertible complex$^3$ matrix $B$ and an upper triangular
complex matrix $T$ such that $A = BTB^{-1}$.

The nice thing about putting $A$ in this form is that

\[ (BTB^{-1})^m = BTB^{-1}BTB^{-1} \cdots BTB^{-1} = BT^mB^{-1} \]

and hence

\[ e^A = \sum_{m \ge 0} \frac{A^m}{m!} = B\left(\sum_{m \ge 0} \frac{T^m}{m!}\right)B^{-1} = Be^TB^{-1}. \]

It therefore suffices to prove $\det(e^T) = e^{\mathrm{Tr}(T)}$ for upper triangular $T$,
because this implies

\[ \det(e^A) = \det(Be^TB^{-1}) = \det(e^T) = e^{\mathrm{Tr}(T)} = e^{\mathrm{Tr}(BTB^{-1})} = e^{\mathrm{Tr}(A)}. \]

Here we are appealing to another theorem from linear algebra, which states
that $\mathrm{Tr}(BC) = \mathrm{Tr}(CB)$ and hence $\mathrm{Tr}(BCB^{-1}) = \mathrm{Tr}(C)$ (exercise).

$^3$ The matrix $B$ may be complex even when $A$ is real. We then have an example of a
phenomenon once pointed out by Jacques Hadamard: the shortest path between two real
objects, in this case $\det(e^A)$ and $e^{\mathrm{Tr}(A)}$, may pass through the complex domain.

To obtain the value of $\det(e^T)$ for upper triangular $T$, suppose that

\[ T = \begin{pmatrix}
t_{11} & * & * & \cdots & * \\
0 & t_{22} & * & \cdots & * \\
0 & 0 & t_{33} & \cdots & * \\
\vdots & & & \ddots & \vdots \\
0 & 0 & \cdots & 0 & t_{nn}
\end{pmatrix}, \]

where the entries marked $*$ are arbitrary. From this one can see that

• $T^2$ is upper triangular, with $i$th diagonal entry equal to $t_{ii}^2$,

• $T^m$ is upper triangular, with $i$th diagonal entry equal to $t_{ii}^m$,

• $e^T$ is upper triangular, with $i$th diagonal entry equal to $e^{t_{ii}}$,

and hence

\[ \det(e^T) = e^{t_{11}} e^{t_{22}} \cdots e^{t_{nn}} = e^{t_{11} + t_{22} + \cdots + t_{nn}} = e^{\mathrm{Tr}(T)}, \]

as required. □
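The identity $\det(e^A) = e^{\mathrm{Tr}(A)}$ can be checked numerically on a matrix that is not triangular. This is a minimal sketch under stated assumptions: the entries of `A` are arbitrary, and `mat_exp` sums the first 60 terms of the exponential series, which is adequate for a matrix of this size but is not a general-purpose method.

```python
import math

def mat_mul(A, B):
    """Product of two n x n matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=60):
    """Partial sum of the exponential series 1 + A/1! + A^2/2! + ..."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

A = [[1.0, 2.0],
     [3.0, -0.5]]                      # arbitrary real matrix, trace 0.5

E = mat_exp(A)
det_E = E[0][0] * E[1][1] - E[0][1] * E[1][0]
trace_A = A[0][0] + A[1][1]
rel_error = abs(det_E - math.exp(trace_A)) / math.exp(trace_A)
print(det_E, math.exp(trace_A), rel_error)
```

Although the individual entries of $e^A$ depend on all four entries of $A$, the determinant depends only on the trace, which is the content of the lemma.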

Tangent space of SU(n). The tangent space of SU(n) consists of all $n \times n$
complex matrices $X$ such that $X + \overline{X}^T = 0$ and $\mathrm{Tr}(X) = 0$.

Proof. Elements of SU(n) are, by definition, matrices $A \in$ U(n) with
$\det(A) = 1$. We know that the $A \in$ U(n) are of the form $e^X$ with $X + \overline{X}^T = 0$.
The extra condition $\det(A) = 1$ is therefore equivalent to

\[ 1 = \det(A) = \det(e^X) = e^{\mathrm{Tr}(X)} \]

by the theorem just proved. It follows that, given any $A \in$ U(n),

\[ A \in \mathrm{SU}(n) \;\Leftrightarrow\; \det(A) = 1 \;\Leftrightarrow\; e^{\mathrm{Tr}(X)} = 1 \;\Leftrightarrow\; \mathrm{Tr}(X) = 0. \]

Thus the tangent space of SU(n) consists of the $n \times n$ complex matrices $X$
such that $X + \overline{X}^T = 0$ and $\mathrm{Tr}(X) = 0$. □

Exercises

Another proof of the crucial result det(V') = e Tr ' A ) uses less linear algebra but 
more calculus. It goes as follows (if you need help with the details, see Tapp 
[2005], p. 72 and p. 88). 

Suppose $B(t)$ is a smooth path of $n \times n$ complex matrices with $B(0) = 1$, let
$b_{ij}(t)$ denote the entry in row $i$ and column $j$ of $B(t)$, and let $B_{ij}(t)$ denote the
result of omitting row $i$ and column $j$.

5.3.1 Show that

$$\det(B(t)) = \sum_{j=1}^{n} (-1)^{j+1} b_{1j}(t) \det(B_{1j}(t)),$$

and hence

$$\left.\frac{d}{dt}\right|_{t=0} \det(B(t)) = \sum_{j=1}^{n} (-1)^{j+1} \left[ b'_{1j}(0)\det(B_{1j}(0)) + b_{1j}(0) \left.\frac{d}{dt}\right|_{t=0} \det(B_{1j}(t)) \right].$$

5.3.2 Deduce from Exercise 5.3.1, and the assumption $B(0) = 1$, that

$$\left.\frac{d}{dt}\right|_{t=0} \det(B(t)) = b'_{11}(0) + \left.\frac{d}{dt}\right|_{t=0} \det(B_{11}(t)).$$

5.3.3 Deduce from Exercise 5.3.2, and induction, that

$$\left.\frac{d}{dt}\right|_{t=0} \det(B(t)) = b'_{11}(0) + b'_{22}(0) + \cdots + b'_{nn}(0) = \operatorname{Tr}(B'(0)).$$

We now apply Exercise 5.3.3 to the smooth path $B(t) = e^{tA}$, for which $B'(0) = A$,
and the smooth real function

$$f(t) = \det(e^{tA}), \quad \text{for which } f(0) = 1.$$

By the definition of derivative,

$$f'(t) = \lim_{h \to 0} \frac{1}{h} \left[ \det(e^{(t+h)A}) - \det(e^{tA}) \right].$$

5.3.4 Using the property $\det(MN) = \det(M)\det(N)$ and Exercise 5.3.3, show that

$$f'(t) = \det(e^{tA}) \left.\frac{d}{dt}\right|_{t=0} \det(e^{tA}) = f(t)\operatorname{Tr}(A).$$

5.3.5 Solve the equation for $f(t)$ in Exercise 5.3.4 by setting $f(t) = g(t)e^{t\operatorname{Tr}(A)}$
and showing that $g'(t) = 0$, hence $g(t) = 1$. (Why?)

Conclude that $\det(e^A) = e^{\operatorname{Tr}(A)}$.

The tangent space of $\mathrm{SU}(2)$ should be the same as the space $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$
shown in Section 4.2 to be mapped onto $\mathrm{SU}(2)$ by the exponential function. This
is true, but it requires some checking.

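5.3.6 Show that the skew-Hermitian matrices in the tangent space of $\mathrm{SU}(2)$ can
be written in the form $b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$, where $b, c, d \in \mathbb{R}$ and $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are
matrices with the same multiplication table as the quaternions $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$.

5.3.7 Also find the tangent space of $\mathrm{Sp}(1)$ (which should be the same).

Finally, it should be checked that $\operatorname{Tr}(XY) = \operatorname{Tr}(YX)$, as required in the proof
that $\det(e^A) = e^{\operatorname{Tr}(A)}$. This can be seen almost immediately by meditating on the
sum

$$\begin{aligned}
& x_{11}y_{11} + x_{12}y_{21} + \cdots + x_{1n}y_{n1} \\
{}+{} & x_{21}y_{12} + x_{22}y_{22} + \cdots + x_{2n}y_{n2} \\
& \qquad \vdots \\
{}+{} & x_{n1}y_{1n} + x_{n2}y_{2n} + \cdots + x_{nn}y_{nn}.
\end{aligned}$$

5.3.8 Interpret this sum as both $\operatorname{Tr}(XY)$ and $\operatorname{Tr}(YX)$.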


5.4 Algebraic properties of the tangent space 


If $G$ is any matrix group, we can define its tangent space at the identity,
$T_1(G)$, to be the set of matrices of the form $X = A'(0)$, where $A(t)$ is a
smooth path in $G$ with $A(0) = 1$.

Vector space properties. $T_1(G)$ is a vector space over $\mathbb{R}$; that is, for any
$X, Y \in T_1(G)$ we have $X + Y \in T_1(G)$ and $rX \in T_1(G)$ for any real $r$.

Proof. Suppose $X = A'(0)$ and $Y = B'(0)$ for smooth paths $A(t), B(t) \in G$
with $A(0) = B(0) = 1$, so $X, Y \in T_1(G)$. It follows that $C(t) = A(t)B(t)$ is
also a smooth path in $G$ with $C(0) = 1$, and hence $C'(0)$ is also a member
of $T_1(G)$.

We now compute $C'(0)$ by the product rule and find

$$C'(0) = \left.\frac{d}{dt}\right|_{t=0} A(t)B(t) = A'(0)B(0) + A(0)B'(0) = X + Y \quad \text{because } A(0) = B(0) = 1.$$

Thus $X, Y \in T_1(G)$ implies $X + Y \in T_1(G)$.

To see why $rX \in T_1(G)$ for any real $r$, consider the smooth path $D(t) = A(rt)$.
We have $D(0) = A(0) = 1$, so $D'(0) \in T_1(G)$, and

$$D'(0) = rA'(0) = rX.$$

Hence $X \in T_1(G)$ implies $rX \in T_1(G)$, as claimed. □
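Both closure properties can be illustrated numerically in $\mathrm{SO}(3)$. The sketch below is an illustration under stated assumptions rather than part of the proof: the paths chosen (rotations about the $z$-axis and the $x$-axis), the helper names, and the central-difference approximation of the velocity at $t = 0$ are all choices made here for the example.

```python
import math

def rot_z(t):
    """Smooth path A(t) in SO(3): rotation about the z-axis, A(0) = 1."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    """Smooth path B(t) in SO(3): rotation about the x-axis, B(0) = 1."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def velocity_at_zero(path, h=1e-5):
    """Central-difference approximation of path'(0)."""
    P, M = path(h), path(-h)
    return [[(P[i][j] - M[i][j]) / (2 * h) for j in range(3)]
            for i in range(3)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

X = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # A'(0)
Y = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]   # B'(0)
r = 2.5

# C(t) = A(t)B(t) should have velocity X + Y at t = 0.
sum_velocity = velocity_at_zero(lambda t: mat_mul(rot_z(t), rot_x(t)))
# D(t) = A(rt) should have velocity rX at t = 0.
scaled_velocity = velocity_at_zero(lambda t: rot_z(r * t))

X_plus_Y = [[X[i][j] + Y[i][j] for j in range(3)] for i in range(3)]
rX = [[r * X[i][j] for j in range(3)] for i in range(3)]

err_sum = max_diff(sum_velocity, X_plus_Y)
err_scaled = max_diff(scaled_velocity, rX)
print(err_sum, err_scaled)   # both expected to be close to zero
```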

We see from this proof that the vector sum is to some extent an image
of the product operation on $G$. But it is not a very faithful image, because
the vector sum is commutative and the product on $G$ generally is not.

We find a product operation on $T_1(G)$ that more faithfully reflects the
product on $G$ by studying the behavior of smooth paths $A(s)$ and $B(t)$ near
1 when $s$ and $t$ vary independently.

Lie bracket property. $T_1(G)$ is closed under the Lie bracket, that is, if
$X, Y \in T_1(G)$ then $[X, Y] \in T_1(G)$, where $[X, Y] = XY - YX$.

Proof. Suppose $A(0) = B(0) = 1$, $A'(0) = X$, $B'(0) = Y$, so $X, Y \in T_1(G)$.
Now consider the path

$$C_s(t) = A(s)B(t)A(s)^{-1} \quad \text{for some fixed value of } s.$$

Then $C_s(t)$ is smooth and $C_s(0) = 1$, so $C_s'(0) \in T_1(G)$. But also,

$$C_s'(0) = A(s)B'(0)A(s)^{-1} = A(s)YA(s)^{-1}$$

is a smooth function of $s$, because $A(s)$ is. So we have a whole smooth path
$A(s)YA(s)^{-1}$ in $T_1(G)$, and hence its tangent (velocity vector) at $s = 0$ is
also in $T_1(G)$. (This is because the tangent is the limit of certain elements
of $T_1(G)$, and $T_1(G)$ is closed under limits.)

This tangent is found by differentiating $D(s) = A(s)YA(s)^{-1}$ with respect
to $s$ at $s = 0$ and using $A(0) = 1$:

$$D'(0) = A'(0)YA(0)^{-1} + A(0)Y(-A'(0)) = XY - YX = [X, Y],$$

since $A'(0) = X$ and $A(0) = 1$. Thus $X, Y \in T_1(G)$ implies $[X, Y] \in T_1(G)$,
as claimed. □
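The key step, that the velocity of $D(s) = A(s)YA(s)^{-1}$ at $s = 0$ is $XY - YX$, can be checked numerically for a concrete path. This is a hedged sketch only: the path $A(s)$ (rotation about the $z$-axis in $\mathrm{SO}(3)$), the matrix $Y$, and the finite-difference step are assumptions made for the illustration, and the helper names are not from the text.

```python
import math

def rot_z(s):
    """Smooth path A(s) in SO(3) with A(0) = 1; its inverse is rot_z(-s)."""
    c, t = math.cos(s), math.sin(s)
    return [[c, -t, 0.0], [t, c, 0.0], [0.0, 0.0, 1.0]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    XY, YX = mat_mul(X, Y), mat_mul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(3)] for i in range(3)]

X = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # A'(0)
Y = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]   # a tangent vector of SO(3)

def D(s):
    """The path D(s) = A(s) Y A(s)^{-1} inside the tangent space."""
    return mat_mul(mat_mul(rot_z(s), Y), rot_z(-s))

h = 1e-5
Dp, Dm = D(h), D(-h)
D_prime = [[(Dp[i][j] - Dm[i][j]) / (2 * h) for j in range(3)]
           for i in range(3)]

B = bracket(X, Y)
err = max(abs(D_prime[i][j] - B[i][j]) for i in range(3) for j in range(3))
# [X, Y] should again be skew-symmetric, i.e. a tangent vector of SO(3).
is_skew = all(B[i][j] == -B[j][i] for i in range(3) for j in range(3))
print(err, is_skew)
```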

The tangent space of $G$, together with its vector space structure and
Lie bracket operation, is called the Lie algebra of $G$, and from now on we
denote it by $\mathfrak{g}$ (the corresponding lower case Fraktur letter).

Definition. A matrix Lie algebra is a vector space of matrices that is closed
under the Lie bracket $[X, Y] = XY - YX$.

All the Lie algebras we have seen so far have been matrix Lie algebras,
and in fact there is a theorem (Ado's theorem) saying that every Lie algebra
is isomorphic to a matrix Lie algebra. Thus it is not wrong to say simply
“Lie algebra” rather than “matrix Lie algebra,” and we will usually do so.

Perhaps the most important idea in Lie theory is to study Lie groups 
by looking at their Lie algebras. This idea succeeds because vector spaces 
are generally easier to work with than curved objects — which Lie groups 
usually are — and the Lie bracket captures most of the group structure. 

However, it should be emphasized at the outset that $\mathfrak{g}$ does not always
capture $G$ entirely, because different Lie groups can have the same Lie
algebra. We have already seen one class of examples. For all $n$, $\mathrm{O}(n)$ is
different from $\mathrm{SO}(n)$, but they have the same tangent space at 1 and hence
the same Lie algebra. There is a simple geometric reason for this: $\mathrm{SO}(n)$
is the subgroup of $\mathrm{O}(n)$ whose members are connected by paths to 1. The
tangent space to $\mathrm{O}(n)$ at 1 is therefore the tangent space to $\mathrm{SO}(n)$ at 1.

Exercises 

If, instead of considering the path $C_s(t) = A(s)B(t)A(s)^{-1}$ in $G$ we consider the
path

$$D_s(t) = A(s)B(t)A(s)^{-1}B(t)^{-1} \quad \text{for some fixed value of } s,$$

then we can relate the Lie bracket $[X, Y]$ of $X, Y \in T_1(G)$ to the so-called commutator
$A(s)B(t)A(s)^{-1}B(t)^{-1}$ of smooth paths $A(s)$ and $B(t)$ through 1 in $G$.

5.4.1 Find $D_s'(t)$, and hence show that $D_s'(0) = A(s)YA(s)^{-1} - Y$.

5.4.2 $D_s'(0) \in T_1(G)$ (why?) and hence, as $s$ varies, we have a smooth path $E(s) = D_s'(0)$
in $T_1(G)$ (why?).

5.4.3 Show that the velocity $E'(0)$ equals $XY - YX$, and explain why $E'(0)$ is in
$T_1(G)$.

The tangent space at 1 is the most natural one to consider, but in fact all
elements of $G$ have the “same” tangent space.

5.4.4 Show that the smooth paths through any $g \in G$ are of the form $gA(t)$, where
$A(t)$ is a smooth path through 1.

5.4.5 Deduce from Exercise 5.4.4 that the space of tangents to $G$ at $g$ is isomorphic
to the space of tangents to $G$ at 1.

5.5 Dimension of Lie algebras

Since the tangent space of a Lie group is a vector space over $\mathbb{R}$, it has a
well-defined dimension over $\mathbb{R}$. We can easily compute the dimension of
$\mathfrak{so}(n)$, $\mathfrak{u}(n)$, $\mathfrak{su}(n)$, and $\mathfrak{sp}(n)$ by counting the number of independent real
parameters in the corresponding matrices.

Dimension of $\mathfrak{so}(n)$, $\mathfrak{u}(n)$, $\mathfrak{su}(n)$, and $\mathfrak{sp}(n)$. As vector spaces over $\mathbb{R}$,

(a) $\mathfrak{so}(n)$ has dimension $n(n-1)/2$.

(b) $\mathfrak{u}(n)$ has dimension $n^2$.

(c) $\mathfrak{su}(n)$ has dimension $n^2 - 1$.

(d) $\mathfrak{sp}(n)$ has dimension $n(2n+1)$.

Proof. (a) We know from Section 5.2 that $\mathfrak{so}(n)$ consists of all $n \times n$ real
skew-symmetric matrices $X$. Thus the diagonal entries are zero, and the
entries below the diagonal are the negatives of those above. It follows that
the dimension of $\mathfrak{so}(n)$ is the number of entries above the diagonal, namely

$$1 + 2 + \cdots + (n-1) = \frac{n(n-1)}{2}.$$

(b) We know from Section 5.3 that $\mathfrak{u}(n)$ consists of all $n \times n$ complex
skew-Hermitian matrices $X$. Thus $X$ has $n(n-1)/2$ complex entries above
the diagonal and $n$ pure imaginary entries on the diagonal, so the number
of independent real parameters in $X$ is

$$n(n-1) + n = n^2.$$

(c) We know from Section 5.3 that $\mathfrak{su}(n)$ consists of all $n \times n$ complex
skew-Hermitian matrices with $\operatorname{Tr}(X) = 0$. Without the $\operatorname{Tr}(X) = 0$ condition,
there are $n^2$ real parameters, as we have just seen in (b). The condition
$\operatorname{Tr}(X) = 0$ says that the $n$th diagonal entry is the negative of the sum of the
remaining diagonal entries, so the number of independent real parameters
is $n^2 - 1$.

(d) We know from Section 5.3 that $\mathfrak{sp}(n)$ consists of all $n \times n$ quaternion
skew-Hermitian matrices $X$. Thus $X$ has $n(n-1)/2$ quaternion entries
above the diagonal and $n$ pure imaginary quaternion entries on the diagonal,
so the number of independent real parameters is

$$2n(n-1) + 3n = n(2n - 2 + 3) = n(2n+1).$$

□
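The parameter counts in this proof can be tabulated mechanically. The following sketch is offered only as a cross-check of the arithmetic; the function names and the helper `count_parameters` are hypothetical names introduced here, not notation from the text.

```python
def dim_so(n):
    return n * (n - 1) // 2

def dim_u(n):
    return n * n

def dim_su(n):
    return n * n - 1

def dim_sp(n):
    return n * (2 * n + 1)

def count_parameters(n, reals_per_entry, reals_per_diagonal_entry):
    """Count free real parameters of an n x n skew-symmetric or
    skew-Hermitian matrix: the entries above the diagonal are free,
    and each diagonal entry is zero or 'pure imaginary'."""
    above_diagonal = sum(range(1, n))          # 1 + 2 + ... + (n - 1)
    return (reals_per_entry * above_diagonal
            + reals_per_diagonal_entry * n)

for n in range(1, 6):
    # real entries (1 parameter each), zero diagonal
    assert count_parameters(n, 1, 0) == dim_so(n)
    # complex entries (2 each), pure imaginary diagonal (1 each)
    assert count_parameters(n, 2, 1) == dim_u(n)
    # the trace condition removes one real parameter
    assert count_parameters(n, 2, 1) - 1 == dim_su(n)
    # quaternion entries (4 each), pure imaginary quaternion diagonal (3 each)
    assert count_parameters(n, 4, 3) == dim_sp(n)
    print(n, dim_so(n), dim_u(n), dim_su(n), dim_sp(n))
```

For $n = 3$, $2$, and $1$ respectively, $\mathfrak{so}(3)$, $\mathfrak{su}(2)$, and $\mathfrak{sp}(1)$ all come out with dimension 3, consistent with the identifications made elsewhere in the text.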


It seems geometrically natural that a matrix group $G$ should have the
same dimension as its tangent space $T_1(G)$ at the identity, but to put this
result on a firm basis we need to construct a bijection between a neighborhood
of 1 in $G$ and a neighborhood of 0 in $T_1(G)$, continuous in both
directions, that is, a homeomorphism. This can be achieved by a deeper study of
the exponential function, which we carry out in Chapter 7 (for other purposes).
But then one faces the even more difficult problem of proving the
invariance of dimension under homeomorphisms. Fortunately, Lie theory
has another way out, which is simply to define the dimension of a Lie group
to be the dimension of its Lie algebra.

Exercises 

The extra dimension that $\mathrm{U}(n)$ has over $\mathrm{SU}(n)$ is reflected in the fact that the quotient
group $\mathrm{U}(n)/\mathrm{SU}(n)$ exists and is isomorphic to the circle group $S^1$. Among
other things, this shows that $\mathrm{U}(n)$ is not a simple group. Here is how to show that
the quotient exists.

5.5.1 Consider the determinant map $\det : \mathrm{U}(n) \to \mathbb{C}$. Why is this a homomorphism?
What is its kernel?

5.5.2 Deduce from Exercise 5.5.1 that $\mathrm{SU}(n)$ is a normal subgroup of $\mathrm{U}(n)$.

Since the dimension of $\mathrm{U}(n)$ is 1 greater than the dimension of $\mathrm{SU}(n)$, we
expect the dimension of $\mathrm{U}(n)/\mathrm{SU}(n)$ to be 1. The elements of $\mathrm{U}(n)/\mathrm{SU}(n)$ correspond
to the values of $\det(A)$, for matrices $A \in \mathrm{U}(n)$, by the homomorphism
theorem of Section 2.2. So these values should form a 1-dimensional group,
isomorphic to either $\mathbb{R}$ or $S^1$. Indeed, they are points on the unit circle in $\mathbb{C}$, as the
following exercises show.

5.5.3 If $A$ is an $n \times n$ complex matrix such that $A\overline{A}^{T} = 1$, show that $|\det(A)| = 1$.

5.5.4 Give an example of a diagonal unitary matrix $A$, with $\det(A) = e^{i\theta}$.


5.6 Complexification 

The Lie algebras we have constructed so far have been vector spaces over
$\mathbb{R}$, even though their elements may be matrices with complex or quaternion
entries. Each element is an initial velocity vector $A'(0)$ of a smooth path
$A(t)$, which is a function of the real variable $t$. It follows that, along with
each velocity vector $A'(0)$, we have its real multiples $rA'(0)$ for each $r \in \mathbb{R}$,
because they are the initial velocity vectors of the paths $A(rt)$. Thus the
elements $A'(0)$ of the Lie algebra admit multiplication by all real numbers
but not necessarily by all complex numbers. One can easily give examples
(Exercise 5.6.1) in which a complex matrix $A$ is in a certain Lie algebra but
$iA$ is not.

However, it is certainly possible for a Lie algebra to be a vector space
over $\mathbb{C}$. Indeed, any real matrix Lie algebra $\mathfrak{g}$ over $\mathbb{R}$ has a complexification

$$\mathfrak{g} + i\mathfrak{g} = \{A + iB : A, B \in \mathfrak{g}\}$$

that is a vector space over $\mathbb{C}$. It is clear that $\mathfrak{g} + i\mathfrak{g}$ is closed under sums,
because $\mathfrak{g}$ is, and it is closed under multiples by complex numbers because

$$(a + ib)(A + iB) = aA - bB + i(bA + aB)$$

and $aA - bB,\ bA + aB \in \mathfrak{g}$ for any real numbers $a$ and $b$.

Also, $\mathfrak{g} + i\mathfrak{g}$ is closed under the Lie bracket because

$$[A_1 + iB_1, A_2 + iB_2] = [A_1, A_2] - [B_1, B_2] + i([B_1, A_2] + [A_1, B_2])$$

by bilinearity, and $[A_1, A_2], [B_1, B_2], [B_1, A_2], [A_1, B_2] \in \mathfrak{g}$ by the closure of $\mathfrak{g}$
under the Lie bracket. Thus $\mathfrak{g} + i\mathfrak{g}$ is a Lie algebra.

Complexifying the Lie algebras $\mathfrak{u}(n)$ and $\mathfrak{su}(n)$, which are not vector
spaces over $\mathbb{C}$, gives Lie algebras that happen to be tangent spaces, namely those of the
general linear group $\mathrm{GL}(n, \mathbb{C})$ and the special linear group $\mathrm{SL}(n, \mathbb{C})$.

$\mathrm{GL}(n, \mathbb{C})$ and its Lie algebra $\mathfrak{gl}(n, \mathbb{C})$

The group $\mathrm{GL}(n, \mathbb{C})$ consists of all $n \times n$ invertible complex matrices $A$. It
is clear that the initial velocity $A'(0)$ of any smooth path $A(t)$ in $\mathrm{GL}(n, \mathbb{C})$ is
itself an $n \times n$ complex matrix. Thus the tangent space $\mathfrak{gl}(n, \mathbb{C})$ of $\mathrm{GL}(n, \mathbb{C})$
is contained in the space $M_n(\mathbb{C})$ of all $n \times n$ complex matrices.

In fact, $\mathfrak{gl}(n, \mathbb{C}) = M_n(\mathbb{C})$. We first observe that exp maps $M_n(\mathbb{C})$ into
$\mathrm{GL}(n, \mathbb{C})$ because, for any $X \in M_n(\mathbb{C})$ we have

• $e^X$ is an $n \times n$ complex matrix.

• $e^X$ is invertible, because it has $e^{-X}$ as its inverse.

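It follows, since $tX \in M_n(\mathbb{C})$ for any $X \in M_n(\mathbb{C})$ and any real $t$, that $e^{tX}$ is
a smooth path in $\mathrm{GL}(n, \mathbb{C})$. Then $X$ is the tangent vector to this path at 1,
and hence the tangent space $\mathfrak{gl}(n, \mathbb{C})$ equals $M_n(\mathbb{C})$, as claimed.

Now we show why $\mathfrak{gl}(n, \mathbb{C})$ is the complexification of $\mathfrak{u}(n)$:

$$\mathfrak{gl}(n, \mathbb{C}) = M_n(\mathbb{C}) = \mathfrak{u}(n) + i\mathfrak{u}(n).$$

It is clear that any member of $\mathfrak{u}(n) + i\mathfrak{u}(n)$ is in $M_n(\mathbb{C})$. So it remains to
show that any $X \in M_n(\mathbb{C})$ can be written in the form

$$X = X_1 + iX_2 \quad \text{where } X_1, X_2 \in \mathfrak{u}(n), \qquad (*)$$

that is, where $X_1$ and $X_2$ are skew-Hermitian. There is a surprisingly simple
way to do this:

$$X = \frac{X - \overline{X}^{T}}{2} + i\,\frac{X + \overline{X}^{T}}{2i}.$$

We leave it as an exercise to check that $X_1 = \frac{X - \overline{X}^{T}}{2}$ and $X_2 = \frac{X + \overline{X}^{T}}{2i}$ satisfy
$X_1 + \overline{X_1}^{T} = 0 = X_2 + \overline{X_2}^{T}$, which completes the proof.

As a matter of fact, for each $X \in \mathfrak{gl}(n, \mathbb{C})$ the equation (*) has a unique
solution with $X_1, X_2 \in \mathfrak{u}(n)$. One solves (*) by first taking the conjugate
transpose of both sides, then forming

$$\begin{aligned}
X + \overline{X}^{T} &= X_1 + \overline{X_1}^{T} + i(X_2 - \overline{X_2}^{T}) \\
&= i(X_2 - \overline{X_2}^{T}) && \text{because } X_1 + \overline{X_1}^{T} = 0 \\
&= 2iX_2 && \text{because } X_2 + \overline{X_2}^{T} = 0,
\end{aligned}$$

$$\begin{aligned}
X - \overline{X}^{T} &= X_1 - \overline{X_1}^{T} + i(X_2 + \overline{X_2}^{T}) \\
&= X_1 - \overline{X_1}^{T} && \text{because } X_2 + \overline{X_2}^{T} = 0 \\
&= 2X_1 && \text{because } X_1 + \overline{X_1}^{T} = 0.
\end{aligned}$$

Thus $X_1 = \frac{X - \overline{X}^{T}}{2}$ and $X_2 = \frac{X + \overline{X}^{T}}{2i}$ are in fact the only values $X_1, X_2 \in \mathfrak{u}(n)$
that satisfy (*).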
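The decomposition (*) is easy to carry out by machine. Here is a minimal sketch using plain Python complex numbers; the function names `conj_transpose`, `skew_hermitian_parts`, and `is_skew_hermitian`, as well as the sample matrix, are illustrative assumptions rather than anything defined in the text.

```python
def conj_transpose(X):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]

def skew_hermitian_parts(X):
    """Return (X1, X2) with X = X1 + i*X2, using
    X1 = (X - X*)/2 and X2 = (X + X*)/(2i), where X* is the conjugate transpose."""
    n = len(X)
    Xh = conj_transpose(X)
    X1 = [[(X[i][j] - Xh[i][j]) / 2 for j in range(n)] for i in range(n)]
    X2 = [[(X[i][j] + Xh[i][j]) / 2j for j in range(n)] for i in range(n)]
    return X1, X2

def is_skew_hermitian(X, tol=1e-12):
    """Check X + X* = 0 entry by entry, up to a small tolerance."""
    n = len(X)
    Xh = conj_transpose(X)
    return all(abs(X[i][j] + Xh[i][j]) < tol
               for i in range(n) for j in range(n))

# An arbitrary 2 x 2 complex matrix (illustrative values).
X = [[1 + 2j, 3 - 1j],
     [0.5j, -4 + 0j]]

X1, X2 = skew_hermitian_parts(X)
recombined = [[X1[i][j] + 1j * X2[i][j] for j in range(2)] for i in range(2)]

print(is_skew_hermitian(X1), is_skew_hermitian(X2))
print(recombined)
```

Both parts pass the skew-Hermitian test even though `X` itself does not, and `recombined` reproduces `X`.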

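$\mathrm{SL}(n, \mathbb{C})$ and its Lie algebra $\mathfrak{sl}(n, \mathbb{C})$

The group $\mathrm{SL}(n, \mathbb{C})$ is the subgroup of $\mathrm{GL}(n, \mathbb{C})$ consisting of the $n \times n$
complex matrices $A$ with $\det(A) = 1$. The tangent vectors of $\mathrm{SL}(n, \mathbb{C})$ are
among the tangent vectors $X$ of $\mathrm{GL}(n, \mathbb{C})$, but they satisfy the additional
condition $\operatorname{Tr}(X) = 0$. This is because $e^{tX} \in \mathrm{GL}(n, \mathbb{C})$ for every real $t$ and

$$\det(e^{tX}) = e^{t\operatorname{Tr}(X)} = 1 \text{ for all real } t \iff \operatorname{Tr}(X) = 0.$$

Conversely, if $X$ has trace zero, then so has $tX$ for any real $t$, so a
matrix $X$ with trace zero gives a smooth path $e^{tX}$ in $\mathrm{SL}(n, \mathbb{C})$. This path
has tangent $X$ at 1, so

$$\mathfrak{sl}(n, \mathbb{C}) = \{X \in M_n(\mathbb{C}) : \operatorname{Tr}(X) = 0\}.$$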

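We now show that the latter set of matrices is the complexification of
$\mathfrak{su}(n)$, namely $\mathfrak{su}(n) + i\mathfrak{su}(n)$. Since any $X \in \mathfrak{su}(n)$ has trace zero, any member of
$\mathfrak{su}(n) + i\mathfrak{su}(n)$ also has trace zero. Conversely, any $X \in M_n(\mathbb{C})$ with trace
zero can be written as

$$X = X_1 + iX_2, \quad \text{where } X_1, X_2 \in \mathfrak{su}(n).$$

We use the same trick as for $\mathfrak{u}(n) + i\mathfrak{u}(n)$; namely, write

$$X = \frac{X - \overline{X}^{T}}{2} + i\,\frac{X + \overline{X}^{T}}{2i}.$$

As before, $X_1 = \frac{X - \overline{X}^{T}}{2}$ and $X_2 = \frac{X + \overline{X}^{T}}{2i}$ are skew-Hermitian. But also, $X_1$
and $X_2$ have trace zero, because $X$ has.

Thus, $\mathfrak{sl}(n, \mathbb{C}) = \{X \in M_n(\mathbb{C}) : \operatorname{Tr}(X) = 0\} = \mathfrak{su}(n) + i\mathfrak{su}(n)$, as
claimed.

Also, by an argument like that used above for $\mathfrak{gl}(n, \mathbb{C})$, each $X \in \mathfrak{sl}(n, \mathbb{C})$
corresponds to a unique ordered pair $X_1, X_2$ of elements of $\mathfrak{su}(n)$ such that

$$X = X_1 + iX_2.$$

This equation therefore gives a 1-to-1 correspondence between the elements
$X$ of $\mathfrak{sl}(n, \mathbb{C})$ and the ordered pairs $(X_1, X_2)$ such that $X_1, X_2 \in \mathfrak{su}(n)$.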
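As a small worked instance of this correspondence (the particular matrix is chosen here purely for illustration), take the trace-zero matrix with a single nonzero entry in $\mathfrak{sl}(2, \mathbb{C})$:

```latex
X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\qquad
\overline{X}^{T} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},
\]
\[
X_1 = \frac{X - \overline{X}^{T}}{2}
    = \begin{pmatrix} 0 & \tfrac12 \\ -\tfrac12 & 0 \end{pmatrix},
\qquad
X_2 = \frac{X + \overline{X}^{T}}{2i}
    = \begin{pmatrix} 0 & -\tfrac{i}{2} \\ -\tfrac{i}{2} & 0 \end{pmatrix},
\]
\[
X_1 + iX_2
    = \begin{pmatrix} 0 & \tfrac12 \\ -\tfrac12 & 0 \end{pmatrix}
    + \begin{pmatrix} 0 & \tfrac12 \\ \tfrac12 & 0 \end{pmatrix}
    = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = X.
```

Both $X_1$ and $X_2$ are skew-Hermitian with trace zero, so they lie in $\mathfrak{su}(2)$, although $X$ itself does not.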

Exercises

5.6.1 Show that $\mathfrak{u}(n)$ and $\mathfrak{su}(n)$ are not vector spaces over $\mathbb{C}$.

5.6.2 Check that $X_1 = \frac{X - \overline{X}^{T}}{2}$ and $X_2 = \frac{X + \overline{X}^{T}}{2i}$ are skew-Hermitian, and that $X_1$ and
$X_2$ have trace zero when $X$ has.

5.6.3 Show that the groups $\mathrm{GL}(n, \mathbb{C})$ and $\mathrm{SL}(n, \mathbb{C})$ are unbounded (noncompact)
when the matrix with $(j,k)$-entry $(a_{jk} + ib_{jk})$ is identified with the point

$$(a_{11}, b_{11}, a_{12}, b_{12}, \ldots, a_{nn}, b_{nn}) \in \mathbb{R}^{2n^2}$$

and distance between matrices is the usual distance between points in $\mathbb{R}^{2n^2}$.

The following exercises show that the matrix $A = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$ in $\mathrm{SL}(2, \mathbb{C})$ is not
equal to $e^X$ for any $X \in \mathfrak{sl}(2, \mathbb{C})$, the $2 \times 2$ matrices with trace zero. Thus exp does
not map the tangent space onto the group in this case. The idea is to calculate $e^X$
explicitly with the help of the Cayley-Hamilton theorem, which for $2 \times 2$ matrices
$X$ says that

$$X^2 - (\operatorname{Tr}(X))X + \det(X)\mathbf{1} = 0.$$

Therefore, when $\operatorname{Tr}(X) = 0$ we have $X^2 = -\det(X)\mathbf{1}$.

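5.6.4 When $X^2 = -\det(X)\mathbf{1}$, show that

$$e^X = \cos\left(\sqrt{\det(X)}\right)\mathbf{1} + \frac{\sin\left(\sqrt{\det(X)}\right)}{\sqrt{\det(X)}}\,X.$$

5.6.5 Using Exercise 5.6.4, and the fact that $\operatorname{Tr}(X) = 0$, show that if

$$e^X = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$$

then $\cos(\sqrt{\det(X)}) = -1$, in which case $\sin(\sqrt{\det(X)}) = 0$, and there is a
contradiction.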

5.6.6 It follows not only that exp does not map $\mathfrak{sl}(2, \mathbb{C})$ onto $\mathrm{SL}(2, \mathbb{C})$ but also
that exp does not map $\mathfrak{gl}(2, \mathbb{C})$ onto $\mathrm{GL}(2, \mathbb{C})$. Why?

This is not our first example of a Lie algebra that is not mapped onto its
group by exp. We have already seen that exp cannot map $\mathfrak{o}(n)$ onto $\mathrm{O}(n)$ because
$\mathfrak{o}(n)$ is path-connected and $\mathrm{O}(n)$ is not. What makes the $\mathfrak{sl}(n, \mathbb{C})$ and $\mathfrak{gl}(n, \mathbb{C})$
examples so interesting is that $\mathrm{SL}(n, \mathbb{C})$ and $\mathrm{GL}(n, \mathbb{C})$ are path-connected. We
gave some results on path-connectedness in Sections 3.2 and 3.8, and will give
more in Section 8.6, including a proof that $\mathrm{GL}(n, \mathbb{C})$ is path-connected.

5.6.7 Find maximal tori, and hence the centers, of $\mathrm{GL}(n, \mathbb{C})$ and $\mathrm{SL}(n, \mathbb{C})$.

5.6.8 Assuming path-connectedness, also find their discrete normal subgroups.


5.7 Quaternion Lie algebras 

Analogous to $\mathrm{GL}(n, \mathbb{C})$, there is the group $\mathrm{GL}(n, \mathbb{H})$ of all invertible $n \times n$
quaternion matrices. Its tangent vectors lie in the space $M_n(\mathbb{H})$ of all
$n \times n$ quaternion matrices, and indeed each $X \in M_n(\mathbb{H})$ is a tangent vector,
because the quaternion matrix $e^{tX}$ has the inverse $e^{-tX}$ and hence lies
in $\mathrm{GL}(n, \mathbb{H})$. So, for each $X \in M_n(\mathbb{H})$ we have the smooth path $e^{tX}$ in
$\mathrm{GL}(n, \mathbb{H})$ with tangent $X$.

Thus the Lie algebra $\mathfrak{gl}(n, \mathbb{H})$ of $\mathrm{GL}(n, \mathbb{H})$ is precisely $M_n(\mathbb{H})$.

However, there is no “$\mathfrak{sl}(n, \mathbb{H})$” of quaternion matrices of trace zero.
This set of matrices is closed under sums and scalar multiples but, because
of the noncommutative quaternion product, not under the Lie bracket. For
example, we have the following matrices of trace zero in $M_2(\mathbb{H})$:


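$$X = \begin{pmatrix} \mathbf{i} & 0 \\ 0 & -\mathbf{i} \end{pmatrix}, \qquad Y = \begin{pmatrix} \mathbf{j} & 0 \\ 0 & -\mathbf{j} \end{pmatrix}.$$

But their Lie bracket is

$$XY - YX = (\mathbf{ij} - \mathbf{ji}) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 2 \begin{pmatrix} \mathbf{k} & 0 \\ 0 & \mathbf{k} \end{pmatrix},$$

which does not have trace zero.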
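The failure of closure can be confirmed by direct computation with quaternion entries. In the sketch below, quaternions are stored as 4-tuples $(a, b, c, d) = a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$, and the pair $X = \mathrm{diag}(\mathbf{i}, -\mathbf{i})$, $Y = \mathrm{diag}(\mathbf{j}, -\mathbf{j})$ is taken as an illustrative (assumed) pair of trace-zero matrices in $M_2(\mathbb{H})$; all helper names are hypothetical.

```python
ZERO = (0, 0, 0, 0)
I = (0, 1, 0, 0)   # quaternion i
J = (0, 0, 1, 0)   # quaternion j

def qmul(p, q):
    """Hamilton product of quaternions (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def qneg(p):
    return tuple(-x for x in p)

def mat_mul(A, B):
    """Product of square quaternion matrices (order of factors matters)."""
    n = len(A)
    C = []
    for i in range(n):
        row = []
        for j in range(n):
            entry = ZERO
            for k in range(n):
                entry = qadd(entry, qmul(A[i][k], B[k][j]))
            row.append(entry)
        C.append(row)
    return C

def bracket(A, B):
    """Lie bracket AB - BA."""
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    n = len(A)
    return [[qadd(AB[i][j], qneg(BA[i][j])) for j in range(n)]
            for i in range(n)]

def trace(A):
    total = ZERO
    for i in range(len(A)):
        total = qadd(total, A[i][i])
    return total

X = [[I, ZERO], [ZERO, qneg(I)]]
Y = [[J, ZERO], [ZERO, qneg(J)]]

print(trace(X), trace(Y))      # both are the zero quaternion
print(trace(bracket(X, Y)))    # a nonzero multiple of k
```

With these choices the bracket is $\mathrm{diag}(2\mathbf{k}, 2\mathbf{k})$, whose trace is $4\mathbf{k}$.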

The quaternion Lie algebra that interests us most is $\mathfrak{sp}(n)$, the tangent
space of $\mathrm{Sp}(n)$. As we found in Section 5.3,

$$\mathfrak{sp}(n) = \{X \in M_n(\mathbb{H}) : X + \overline{X}^{T} = 0\},$$

where $\overline{X}$ denotes the result of replacing each entry of $X$ by its quaternion
conjugate.

There is no neat relationship between $\mathfrak{sp}(n)$ and $\mathfrak{gl}(n, \mathbb{H})$ analogous
to the relationship between $\mathfrak{su}(n)$ and $\mathfrak{sl}(n, \mathbb{C})$. This can be seen by considering
dimensions: $\mathfrak{gl}(n, \mathbb{H})$ has dimension $4n^2$ over $\mathbb{R}$, whereas $\mathfrak{sp}(n)$
has dimension $2n^2 + n$, as we saw in Section 5.5. Therefore, we cannot
decompose $\mathfrak{gl}(n, \mathbb{H})$ into two subspaces that look like $\mathfrak{sp}(n)$, because the
dimensions do not add up.
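Spelled out as arithmetic (a short worked check that uses only the two dimensions quoted above):

```latex
\dim_{\mathbb{R}} \mathfrak{gl}(n,\mathbb{H}) = 4n^2,
\qquad
2\,\dim_{\mathbb{R}} \mathfrak{sp}(n) = 2(2n^2 + n) = 4n^2 + 2n > 4n^2
\quad \text{for every } n \ge 1 .
% For n = 1 the comparison is 4 versus 6.
```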

As a result, we need to analyze $\mathfrak{sp}(n)$ from scratch, and it turns out to
be “simpler” than $\mathfrak{gl}(n, \mathbb{H})$, in a sense we will explain in Section 6.6.

Exercises 

5.7.1 Give three examples of subspaces of $\mathfrak{gl}(n, \mathbb{H})$ closed under the Lie bracket.

5.7.2 What are the dimensions of your examples?

5.7.3 If your examples do not include one of real dimension 1, give such an example.

5.7.4 Also, if you have not already done so, give an example $\mathfrak{g}$ of dimension $n$
that is commutative. That is, $[X, Y] = 0$ for all $X, Y \in \mathfrak{g}$.



5.8 Discussion 

The classical groups were given their name by Hermann Weyl in his 1939 
book The Classical Groups. Weyl did not give a precise enumeration of the 
groups he considered “classical,” but it seems plausible from the content of 
his book that he meant the general and special linear groups, the orthogonal 
groups, and the unitary and symplectic groups. Weyl briefly mentioned
that the concept of orthogonal group can be extended to include the group
$\mathrm{O}(p, q)$ of transformations of $\mathbb{R}^{p+q}$ preserving the (not positive definite)
inner product defined by

$$(u_1, u_2, \ldots, u_p, u'_1, u'_2, \ldots, u'_q) \cdot (v_1, v_2, \ldots, v_p, v'_1, v'_2, \ldots, v'_q)
= u_1 v_1 + u_2 v_2 + \cdots + u_p v_p - u'_1 v'_1 - u'_2 v'_2 - \cdots - u'_q v'_q.$$

An important special case is the Lorentz group $\mathrm{O}(1, 3)$, which defines
the geometry of Minkowski space, the “spacetime” of special relativity.
There are also “$p, q$ generalizations” of the unitary and symplectic groups,
and today these groups are often considered “classical.” However, in this
book we apply the term “classical groups” only to the general and special
linear groups, and $\mathrm{O}(n)$, $\mathrm{SO}(n)$, $\mathrm{U}(n)$, $\mathrm{SU}(n)$, and $\mathrm{Sp}(n)$.

Weyl also introduced the term “Lie algebra” (in lectures at Princeton in 
1934-35, at the suggestion of Nathan Jacobson) for the collection of what 
Lie had called the “infinitesimal elements of a continuous group.” 

The Lie algebras of the classical groups were implicitly known by Lie. 
However, the description of Lie algebras by matrices was taken up only 
belatedly, alongside the late-dawning realization that linear algebra is a 
fundamental part of mathematics. As we have seen, the serious study of 
matrix Lie groups began with von Neumann [1929], and the first examples 
of nonmatrix Lie groups were not given until 1936. At about the same 
time, I. D. Ado showed that linear algebra really is an adequate basis for 
the theory of Lie algebras, in the sense that any Lie algebra can be viewed 
as a vector space of matrices. 

As late as 1946, Chevalley thought it worthwhile to point out why it is 
convenient to view elements of matrix groups as exponentials of elements 
in their Lie algebras: 

The property of a matrix being orthogonal or unitary is defined
by a system of nonlinear relationships between its coefficients;
the exponential mapping gives a parametric representation of
the set of unitary (or orthogonal) matrices by matrices whose
coefficients satisfy linear relations.

Chevalley [1946] is the first book, as far as I know, to explicitly describe 
the Lie algebras of orthogonal, unitary, and symplectic groups as the spaces 
of skew-symmetric and skew-Hermitian matrices. 

The idea of viewing the Lie algebra as the tangent space of the group 
goes back a little further, though it did not spring into existence fully 
grown. In von Neumann [1929], elements of the Lie algebra of a matrix
group $G$ are taken to be limits of sequences of matrices in $G$, and von
Neumann’s limits can indeed be viewed as tangents, though this fact is not
immediately obvious (see Section 7.3). The idea of defining tangent vectors
to $G$ via smooth paths in $G$ seems to originate with Pontrjagin [1939],
p. 183. The full-blooded definition of Lie groups as smooth manifolds and 
Lie algebras as their tangent spaces appears in Chevalley [1946]. 

In this book I do not wish to operate at the level of generality that 
requires a definition of smooth manifolds. However, a few remarks are 
in order, since the concept of smooth manifold includes some objects that 
do not look “smooth” at first sight. For example, a single point is smooth 
and so is any finite set of points. This has the consequence that $\{1, -1\}$
is a smooth subgroup of $\mathrm{SU}(2)$, and also of $\mathrm{SO}(n)$ for any even $n$. The
reason is that a smooth group should have a tangent space at every point,
but nobody said the tangent space has to be big!

“Smoothness” of a $k$-dimensional group $G$ should imply that $G$ has a
tangent space isomorphic to $\mathbb{R}^k$ at 1 (and hence at any point), but this includes
the possibility that the tangent space is $\mathbb{R}^0 = \{0\}$. We must therefore
accept groups as “smooth” if they have zero tangent space at 1, which is
the case for $\{1\}$, $\{1, -1\}$, and any other finite group. In fact, finite groups
are included in the definition of “matrix Lie group” stated in Section 1.1,
since they are closed under nonsingular limits.

Nevertheless, the presence of nontrivial groups with zero tangent space, 
such as {1, —1}, complicates the search for simple groups. If a group G is 
simple, then its tangent space g is a simple Lie algebra, in a sense that will 
be defined in the next chapter. Simple Lie algebras are generally easier to 
recognize than simple Lie groups, so we find the simple Lie algebras g first 
and then see what they tell us about the group $G$. A good idea, except that
$\mathfrak{g}$ cannot “see” the finite subgroups of $G$, because they have zero tangent
space. Simplicity of $\mathfrak{g}$ therefore does not rule out the possibility of finite
normal subgroups of $G$, because they are “invisible” to $\mathfrak{g}$. This is why we
took the trouble to find the centers of various groups in Chapter 3. It turns
out, as we will show in Chapter 7, that $\mathfrak{g}$ can “see” all the normal subgroups
of $G$ except those that lie in the center, so in finding the centers we have
already found all the normal subgroups.

The pioneers of Lie theory, such as Lie himself, were not troubled by 
the subtle difference between simplicity of a Lie group and simplicity of its 
Lie algebra. They viewed Lie groups only locally and took members of the 
Lie algebra to be members of the Lie group anyway (the “infinitesimal” el- 
ements). For the pioneers, the problem was to find the simple Lie algebras. 
Lie himself found almost all of them, as Lie algebras of classical groups. 
But finding the remaining simple Lie algebras (the so-called exceptional
Lie algebras) was a monumentally difficult problem. Its solution by Wilhelm
Killing around 1890, with corrections by Élie Cartan in 1894, is now
viewed as one of the greatest achievements in the history of mathematics.

Since the 1920s and 1930s, when Lie groups came to be viewed as 
global objects and Lie algebras as their tangent spaces at 1, the question of 
what to say about simple Lie groups has generally been ignored or fudged. 
Some authors avoid saying anything by defining a simple Lie group to be 
one whose Lie algebra is simple, often without pointing out that this con- 
flicts with the standard definition of simple group. Others (such as Bour- 
baki [1972]) define a Lie group to be almost simple if its Lie algebra is 
simple, which is another way to avoid saying anything about the genuinely 
simple Lie groups. 

The first paper to study the global properties of Lie groups was Schreier 
[1925]. This paper was overlooked for several years, but it turned out to 
be extremely prescient. Schreier accurately identified both the general role 
of topology in Lie theory, and the special role of the center of a Lie group. 
Thus there is a long-standing precedent for studying Lie group structure as
a topological refinement of Lie algebra structure, and we will take up some
of Schreier’s ideas in Chapters 8 and 9.

Preview

In this chapter we return to our original motive for studying Lie algebras: 
to understand the structure of Lie groups. We saw in Chapter 2 how normal 
subgroups help to reveal the structure of the groups SO(3) and SO(4). To 
go further, we need to know exactly how the normal subgroups of a Lie 
group G are reflected in the structure of its Lie algebra g. 

The focus of attention shifts from groups to algebras with the following 
discovery. The tangent map from a Lie group G to its Lie algebra g sends 
normal subgroups of G to substructures of g called ideals. Thus the ideals 
of g “detect” normal subgroups of G in the sense that a nontrivial ideal of 
g implies a nontrivial normal subgroup of G. 

Lie algebras with no nontrivial ideals, like groups with no nontrivial 
normal subgroups, are called simple. It is not quite true that simplicity of 
g implies simplicity of G, but it turns out to be easier to recognize simple 
Lie algebras, so we consider that problem first. 

We prove simplicity for the “generalized rotation” Lie algebras $\mathfrak{so}(n)$
for $n > 4$, $\mathfrak{su}(n)$, $\mathfrak{sp}(n)$, and also for the Lie algebra of the special linear
group of $\mathbb{C}^n$. The proofs occupy quite a few pages, but they are all variations
on the same elementary argument. It may help to skip the details
(which are only matrix computations) at first reading.

6.1 Normal subgroups and ideals

In Chapter 5 we found the tangent spaces of the classical Lie groups: the
classical Lie algebras. In this chapter we use the tangent spaces to find
candidates for simplicity among the classical Lie groups $G$. We do so by
finding substructures of the tangent space $\mathfrak{g}$ that are tangent spaces of the
normal subgroups of $G$. These are the ideals,$^4$ defined as follows.

Definition. An ideal $\mathfrak{h}$ of a Lie algebra $\mathfrak{g}$ is a subspace of $\mathfrak{g}$ closed under
Lie brackets with arbitrary members of $\mathfrak{g}$. That is, if $Y \in \mathfrak{h}$ and $X \in \mathfrak{g}$ then

$$[X, Y] \in \mathfrak{h}.$$

Then the relationship between normal subgroups and ideals is given by
the following theorem.

Tangent space of a normal subgroup. If $H$ is a normal subgroup of a
matrix Lie group $G$, then $T_1(H)$ is an ideal of the Lie algebra $T_1(G)$.

Proof. $T_1(H)$ is a vector space, like any tangent space, and it is a subspace
of $T_1(G)$ because any tangent to $H$ at 1 is a tangent to $G$ at 1. Thus it
remains to show that $T_1(H)$ is closed under Lie brackets with members of
$T_1(G)$. To do this we use the property of a normal subgroup that $B \in H$ and
$A \in G$ implies $ABA^{-1} \in H$.

It follows that $A(s)B(t)A(s)^{-1}$ is a smooth path in $H$ for any smooth
paths $A(s)$ in $G$ and $B(t)$ in $H$. As usual, we suppose $A(0) = 1 = B(0)$, so
$A'(0) = X \in T_1(G)$ and $B'(0) = Y \in T_1(H)$. If we let

$$C_s(t) = A(s)B(t)A(s)^{-1},$$

then it follows as in Section 5.4 that

$$D(s) = C_s'(0) = A(s)YA(s)^{-1}$$

is a smooth path in $T_1(H)$. It likewise follows that

$$D'(0) = XY - YX \in T_1(H),$$

and hence $T_1(H)$ is an ideal, as claimed. □

$^4$ This terminology comes from algebraic number theory, via ring theory. In the 1840s,
Kummer introduced some objects he called “ideal numbers” and “ideal primes” in order to
restore unique prime factorization in certain systems of algebraic numbers where ordinary
prime factorization is not unique. Kummer's “ideal numbers” did not have a clear meaning
at first, but in 1871 Dedekind gave them a concrete interpretation as certain sets of numbers
closed under sums, and closed under products with all numbers in the system. In the 1920s,
Emmy Noether carried the concept of ideal to general ring theory. Roughly speaking, a
ring is a set of objects with sum and product operations. The sum operation satisfies the
usual properties of sum (commutative, associative, etc.) but the product is required only
to “distribute” over sum: $a(b + c) = ab + ac$. A Lie algebra is a ring in this general sense
(with the Lie bracket as the “product” operation), so Lie algebra ideals are included in the
general concept of ideal.

Remark. In Section 7.5 we will sharpen this theorem by showing that
$T_1(H) \neq \{0\}$ provided $H$ is not discrete, that is, provided there are points
in $H$ not equal to 1 but arbitrarily close to it. Therefore, if $\mathfrak{g}$ has no ideals
other than itself and $\{0\}$, then the only nontrivial normal subgroups of $G$
are discrete. We saw in Section 3.8 that any discrete normal subgroup of
a path-connected group $G$ is contained in $Z(G)$, the center of $G$. For the
generalized rotation groups $G$ (which we found to be path-connected in
Chapter 3, and which are the main candidates for simplicity), we already
found $Z(G)$ in Section 3.7. In each case $Z(G)$ is finite, and hence discrete.

This remark shows that the Lie algebra $\mathfrak{g} = T_1(G)$ can “see” normal
subgroups of $G$ that are not too small. $T_1(G)$ retains an image of a normal
subgroup $H$ as an ideal $T_1(H)$, which is “visible” ($T_1(H) \neq \{0\}$) provided
$H$ is not discrete. Thus, if we leave aside the issue of discrete normal
subgroups for the moment, the problem of finding simple matrix Lie groups
essentially reduces to finding the Lie algebras with no nontrivial ideals.

In analogy with the definition of simple group (Section 2.2), we define 
a simple Lie algebra to be one with no ideals other than itself and {0}. 
By the remarks above, we can make a big step toward finding simple Lie 
groups by finding the simple Lie algebras among those for the classical 
groups. We do this in the sections below, before returning to Lie groups to 
resolve the remaining difficulties with discrete subgroups and centers. 

Simplicity of so(3)

We know from Section 2.3 that SO(3) is a simple group, so we do not 
really need to investigate whether so(3) is a simple Lie algebra. However, 
it is easy to prove the simplicity of so(3) directly, and the proof is a model 
for proofs we give for more complicated Lie algebras later in this chapter. 

First, notice that the tangent space so(3) of SO(3) at 1 is the same as
the tangent space su(2) of SU(2) at 1. This is because elements of SO(3)
can be viewed as antipodal pairs ±q of quaternions q in SU(2). Tangents
to SU(2) are determined by the q near 1, in which case −q is not near 1,
so the tangents to SO(3) are the same as the tangents to SU(2).

Thus the Lie algebra so(3) equals su(2), which we know from Section
4.4 is the cross-product algebra on ℝ^3. (Another proof that so(3) is the
cross-product algebra on ℝ^3 is in Exercises 5.2.1–5.2.3.)

Simplicity of the cross-product algebra. The cross-product algebra is 
simple. 

Proof. It suffices to show that any nonzero ideal equals ℝ^3 = ℝi + ℝj + ℝk,
where i, j, and k are the usual basis vectors for ℝ^3.

Suppose that ℑ is an ideal, with a nonzero member u = xi + yj + zk.
Suppose, for example, that x ≠ 0. By the definition of ideal, ℑ is closed
under cross products with all elements of ℝ^3. In particular,

u × j = xk − zi ∈ ℑ,

and hence

(xk − zi) × i = xj ∈ ℑ.

Then x^{−1}(xj) = j ∈ ℑ also, since ℑ is a subspace. It follows, by taking cross
products with k and i, that i, k ∈ ℑ as well.

Thus ℑ is a subspace of ℝ^3 that includes the basis vectors i, j, and k,
so ℑ = ℝ^3. There is a similar argument if y ≠ 0 or z ≠ 0, and hence the
cross-product algebra on ℝ^3 is simple. □
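The steps of this proof can be replayed mechanically. The following is a minimal sketch, not part of the original text: the function names `cross`, `scale`, and `basis_from` are hypothetical, and only the case x ≠ 0 is treated. It starts from a vector u in the ideal and recovers i, j, and k using only cross products and scalar multiples.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product on R^3, which is the Lie bracket of so(3)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def scale(c, u):
    """Scalar multiple of a vector."""
    return tuple(c * a for a in u)

I, J, K = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def basis_from(u):
    """Follow the proof for u = xi + yj + zk with x != 0.

    Every intermediate vector is obtained from u by cross products with
    elements of R^3 or by scalar multiples, so it lies in any ideal
    that contains u.
    """
    x = Fraction(u[0])
    if x == 0:
        raise ValueError("this sketch only treats the case x != 0")
    step1 = cross(u, J)        # u x j = xk - zi
    step2 = cross(step1, I)    # (xk - zi) x i = xj
    j = scale(1 / x, step2)    # x^{-1}(xj) = j
    i = cross(j, K)            # j x k = i
    k = cross(I, j)            # i x j = k
    return i, j, k
```

For example, `basis_from((2, 5, -7))` returns the three standard basis vectors (with `Fraction` entries), as the proof predicts.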

The algebraic argument above — nullifying all but one component of 
a nonzero element to show that a nonzero ideal 3 includes all the basis 
vectors — is the model for several simplicity proofs later in this chapter. The 
later proofs look more complicated, because they involve Lie bracketing 
of a nonzero matrix to nullify all but one basis element (which may be a 
matrix with more than one nonzero entry). But they similarly show that a 
nonzero ideal includes all basis elements, and hence is the whole algebra, 
so the general idea is the same. 

Exercises 

Another way in which T_1(G) may misrepresent G is when T_1(H) = T_1(G) but H
is not all of G.

6.1.1 Show that T_1(O(n)) = T_1(SO(n)) for each n, and that SO(n) is a normal
subgroup of O(n).

6.1.2 What are the cosets of SO(n) in O(n)?
An example of a matrix Lie group with a nontrivial normal subgroup is U(n).
We determined the appropriate tangent spaces in Section 5.3. 

6.1.3 Show that SU(n) is a normal subgroup of U(n) by describing it as the kernel 
of a homomorphism. 

6.1.4 Show that T_1(SU(n)) is an ideal of T_1(U(n)) by checking that it has the
required closure properties. 


6.2 Ideals and homomorphisms 

If we restrict attention to matrix Lie groups (as we generally do in this 
book) then we cannot assume that every normal subgroup H of a Lie group 
G is the kernel of a matrix group homomorphism G → G/H. The problem
is that the quotient G/H of matrix groups is not necessarily a matrix group. 
This is why we derived the relationship between normal subgroups and 
ideals without reference to homomorphisms. 

Nevertheless, some important normal subgroups are kernels of matrix 
Lie group homomorphisms. One such homomorphism is the determinant 
map G → ℂ^×, where ℂ^× denotes the group of nonzero complex numbers
(or 1 × 1 nonzero complex matrices) under multiplication. Also, any ideal
is the kernel of a Lie algebra homomorphism — defined to be a map of 
Lie algebras that preserves sums, scalar multiples, and the Lie bracket — 
because in fact any Lie algebra is isomorphic to a matrix Lie algebra. 

An important Lie algebra homomorphism is the trace map, 

Tr(A) = sum of diagonal elements of A, 

for real or complex matrices A. We verify that Tr is a Lie algebra homo- 
morphism in the next section. 

The general theorem about kernels is the following. 

Kernel of a Lie algebra homomorphism. If φ : g → g′ is a Lie algebra
homomorphism, and

h = {X ∈ g : φ(X) = 0}

is its kernel, then h is an ideal of g.
Proof. Since φ preserves sums and scalar multiples, h is a subspace:

X_1, X_2 ∈ h ⇒ φ(X_1) = 0, φ(X_2) = 0
             ⇒ φ(X_1 + X_2) = 0   because φ preserves sums
             ⇒ X_1 + X_2 ∈ h,

X ∈ h ⇒ φ(X) = 0
      ⇒ cφ(X) = 0
      ⇒ φ(cX) = 0   because φ preserves scalar multiples
      ⇒ cX ∈ h.

Also, h is closed under Lie brackets with members of g because

X ∈ h ⇒ φ(X) = 0
      ⇒ φ([X,Y]) = [φ(X), φ(Y)] = [0, φ(Y)] = 0
           for any Y ∈ g because φ preserves Lie brackets
      ⇒ [X,Y] ∈ h for any Y ∈ g.

Thus h is an ideal, as claimed. □

It follows from this theorem that a Lie algebra is not simple if it admits
a homomorphism with nontrivial kernel (a kernel that is neither {0} nor the
whole algebra). This points to the existence of non-simple Lie
algebras, which we should look at first, if only to know what to avoid when
we search for simple Lie algebras.

Exercises 

There is a sense in which any homomorphism of a Lie group G “induces” a homo- 
morphism of the Lie algebra T\ (G). We study this relationship in some depth in 
Chapter 9. Here we explore the special case of the det homomorphism, assuming 
also that G is a group for which exp maps T_1(G) onto G.

6.2.1 If we map each X ∈ T_1(G) to Tr(X), where does the corresponding member
e^X of G go?

6.2.2 If we map each e^X ∈ G to det(e^X), where does the corresponding X ∈ T_1(G)
go?

6.2.3 In particular, why is there a well-defined image of X when e^X = e^X′?
6.3 Classical non-simple Lie algebras

We know from Section 2.7 that SO(4) is not a simple group, so we expect 
that so(4) is not a simple Lie algebra. We also know, from Section 5.6, 
about the groups GL(n,C) and their subgroups SL(n,C). The subgroup
SL(n,C) is normal in GL(n,C) because it is the kernel of the homomorphism

det : GL(n,C) → ℂ^×.

It follows that GL(n,C) is not a simple group for any n, so we expect that
gl(n,C) is not a simple Lie algebra for any n. We now prove that these Lie
algebras are not simple by finding suitable ideals.

An ideal in gl(n,C)

We know from Section 5.6 that gl(n,C) = M_n(C) (the space of all n × n
complex matrices), and sl(n,C) is the subspace of all matrices in M_n(C)
with trace zero. This subspace is an ideal, because it is the kernel of a Lie
algebra homomorphism.

Consider the trace map

Tr : M_n(C) → ℂ.

The kernel of this map is certainly sl(n,C), but we have to check that this
map is a Lie algebra homomorphism. It is a vector space homomorphism
because

Tr(X + Y) = Tr(X) + Tr(Y) and Tr(zX) = zTr(X) for any z ∈ ℂ,

as is clear from the definition of trace.

Also, if we view ℂ as the Lie algebra with trivial Lie bracket [u,v] =
uv − vu = 0, then Tr preserves the Lie bracket. This is due to the (slightly
less obvious) property that Tr(XY) = Tr(YX), which can be checked by
computing both sides (see Exercise 5.3.8). Assuming this property of Tr,
we have

Tr([X,Y]) = Tr(XY − YX)
          = Tr(XY) − Tr(YX)
          = 0
          = [Tr(X), Tr(Y)].

Thus Tr is a Lie bracket homomorphism and its kernel, sl(n,C), is
necessarily an ideal of M_n(C) = gl(n,C).
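The key property Tr(XY) = Tr(YX) is easy to spot-check numerically. The sketch below is an illustration only, under the assumption that matrices are stored as nested Python lists; the helper names `matmul`, `bracket`, and `trace` are hypothetical. The sample matrices have Gaussian-integer entries so that the floating-point complex arithmetic is exact.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[r][c] - BA[r][c] for c in range(n)] for r in range(n)]

def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[r][r] for r in range(len(A)))

# Two arbitrary 3 x 3 complex matrices (neither needs to have trace zero).
X = [[1 + 2j, 0, 3],
     [4, -1j, 5],
     [2, 7, 6]]
Y = [[0, 1, 2 - 1j],
     [3j, 4, 0],
     [1, 1, 1]]

same_trace = trace(matmul(X, Y)) == trace(matmul(Y, X))
bracket_in_kernel = trace(bracket(X, Y)) == 0
```

Both flags come out true for this pair: the bracket of any two elements of gl(n,C) lands in the kernel of Tr, which is stronger than what the ideal property requires.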


An ideal in so(4)

In Sections 2.5 and 2.7 we saw that every rotation of ℍ = ℝ^4 is a map of
the form q ↦ v^{−1}qw, where v, w ∈ Sp(1) (the group of unit quaternions,
also known as SU(2)). In Section 2.7 we showed that the map

Φ : Sp(1) × Sp(1) → SO(4)

that sends (v,w) to the rotation q ↦ v^{−1}qw is a 2-to-1 homomorphism onto
SO(4). This is a Lie group homomorphism, so by Section 6.1 we expect it
to induce a Lie algebra homomorphism onto so(4),

φ : sp(1) × sp(1) → so(4),

because sp(1) × sp(1) is surely the Lie algebra of Sp(1) × Sp(1). Indeed,
any smooth path in Sp(1) × Sp(1) has the form u(t) = (v(t), w(t)), so

u′(0) = (v′(0), w′(0)) ∈ sp(1) × sp(1).

And as (v(t), w(t)) runs through all pairs of smooth paths in Sp(1) × Sp(1),
(v′(0), w′(0)) runs through all pairs of velocity vectors in sp(1) × sp(1).

Moreover, the homomorphism φ is 1-to-1. Of the two pairs (v(t), w(t))
and (−v(t), −w(t)) that map to the same rotation q ↦ v(t)^{−1}qw(t), exactly
one goes through the identity 1 when t = 0 (the other goes through −1).
Therefore, the two pairs between them yield only one velocity vector in
sp(1) × sp(1), either (v′(0), w′(0)) or (−v′(0), −w′(0)). Thus φ is in fact
an isomorphism of sp(1) × sp(1) onto so(4). (For a matrix description of
this isomorphism, see Exercise 6.5.4.)

But sp(1) × sp(1) has a homomorphism with nontrivial kernel, namely,

(v′(0), w′(0)) ↦ (0, w′(0)), with kernel sp(1) × {0}.

The subspace sp(1) × {0} is therefore a nontrivial ideal of so(4). Since
sp(1) is isomorphic to so(3), and so(3) × {0} is isomorphic to so(3), this
ideal can be viewed as an so(3) inside so(4).

Exercises 

A more concrete proof that sl(n,C) is an ideal of gl(n,C) can be given by checking
that the matrices in sl(n,C) are closed under Lie bracketing with any member of
gl(n,C). In fact, the Lie bracket of any two elements of gl(n,C) lies in sl(n,C),
as the following exercises show.

We let

\[
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nn}
\end{pmatrix}
\]

be any element of gl(n,C), and consider its Lie bracket with e_ij, the matrix with
1 as its (i,j)-entry and zeros elsewhere.

6.3.1 Describe Xe_ij and e_ij X. Hence show that the trace of [X, e_ij] is x_ji − x_ji = 0.

6.3.2 Deduce from Exercise 6.3.1 that Tr([X,Y]) = 0 for any X, Y ∈ gl(n,C).

6.3.3 Deduce from Exercise 6.3.2 that sl(n,C) is an ideal of gl(n,C).

Another example of a non-simple Lie algebra is u(n), the algebra of n × n
skew-hermitian matrices.

6.3.4 Find a 1-dimensional ideal ℑ in u(n), and show that ℑ is the tangent space
of Z(U(n)).

6.3.5 Also show that Z(U(n)) is the image, under the exponential map, of the
ideal ℑ in Exercise 6.3.4.


6.4 Simplicity of sl(n,C) and su(n)

We saw in Section 5.6 that sl(n,C) consists of all n × n complex matrices
with trace zero. This set of matrices is a vector space over ℂ, and it has
a natural basis consisting of the matrices e_ij for i ≠ j and e_ii − e_nn for
i = 1, 2, ..., n − 1, where e_ij is the matrix with 1 as its (i,j)-entry and zeros
elsewhere. These matrices span sl(n,C). In fact, for any X ∈ sl(n,C),

X = (x_ij) = Σ_{i≠j} x_ij e_ij + Σ_{i=1}^{n−1} x_ii (e_ii − e_nn)

because x_nn = −x_11 − x_22 − ··· − x_{n−1,n−1} for the trace of X to be zero. Also,
X is the zero matrix only if all the coefficients are zero, so the matrices e_ij
for i ≠ j and e_ii − e_nn for i = 1, 2, ..., n − 1 are linearly independent.

These basis elements are convenient for Lie algebra calculations because
the Lie bracket of any X with an e_ij has few nonzero entries. This
enables us to take any nonzero member of an ideal ℑ and manipulate it
to find a nonzero multiple of each basis element in ℑ, thus showing that
sl(n,C) contains no nontrivial ideals.


Simplicity of sl(n,C). For each n, sl(n,C) is a simple Lie algebra.

Proof. If X = (x_ij) is any n × n matrix, then Xe_ij has all columns zero
except the jth, which is occupied by the ith column of X, and −e_ij X has
all rows zero except the ith, which is occupied by −(row j) of X.
Therefore, since [X,e_ij] = Xe_ij − e_ij X, we have

\[
\text{column } j \text{ of } [X,e_{ij}] =
\begin{pmatrix}
x_{1i} \\ \vdots \\ x_{i-1,i} \\ x_{ii}-x_{jj} \\ x_{i+1,i} \\ \vdots \\ x_{ni}
\end{pmatrix}
\]

and

\[
\text{row } i \text{ of } [X,e_{ij}] =
\begin{pmatrix}
-x_{j1} & \cdots & -x_{j,j-1} & x_{ii}-x_{jj} & -x_{j,j+1} & \cdots & -x_{jn}
\end{pmatrix},
\]

and all other entries of [X,e_ij] are zero. In the (i,j)-position, where the
shifted row and column cross, we get the element x_ii − x_jj.

We now use such bracketing to show that an ideal ℑ with a nonzero
member X includes all the basis elements of sl(n,C), so ℑ = sl(n,C).


Case (i). X has nonzero entry x_ji for some i ≠ j.

Multiply [X,e_ij] by e_ij on the right. This destroys all columns except
the ith, whose only nonzero element is −x_ji in the (i,i)-position, moving it
to the (i,j)-position (because column i is moved to column j position).

Now multiply [X,e_ij] by −e_ij on the left. This destroys all rows except
the jth, whose only nonzero element is x_ji at the (j,j)-position, moving it
to the (i,j)-position and changing its sign (because row j is moved to row
i position, with a sign change).

It follows that [X,e_ij]e_ij − e_ij[X,e_ij] = [[X,e_ij],e_ij] contains the nonzero
element −2x_ji at the (i,j)-position, and zeros elsewhere.

Thus the ideal ℑ containing X also contains e_ij. By further bracketing
we can show that all the basis elements of sl(n,C) are in ℑ. For
a start, if e_ij ∈ ℑ then e_ji ∈ ℑ, because the calculation above shows that
[[e_ij,e_ji],e_ji] = −2e_ji. The other basis elements can be obtained by using
the result

[e_ij, e_jk] = e_ik if i ≠ k,
[e_ij, e_jk] = e_ii − e_jj if i = k,

which can be checked by matrix multiplication (Exercise 6.4.1).

For example, suppose we have e_12 and we want to get e_43. This is
achieved by the following pair of bracketings, from right and left:

[e_12, e_23] = e_13,
[e_41, e_13] = e_43.

All e_kl with k ≠ l are obtained similarly. Once we have all of these, we
obtain the remaining basis elements of sl(n,C) by

[e_in, e_ni] = e_ii − e_nn.

Case (ii). All the nonzero entries of X are among x_11, x_22, ..., x_nn.

Not all these elements are equal (otherwise, Tr(X) ≠ 0), so we can
choose i and j such that x_ii − x_jj ≠ 0. Now, for this X, the calculation of
[X,e_ij] gives

[X,e_ij] = (x_ii − x_jj)e_ij.

Thus ℑ includes a nonzero multiple of e_ij, and hence e_ij itself. We can
now repeat the rest of the argument in case (i) to conclude again that ℑ =
sl(n,C), so sl(n,C) is simple. □
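The two bracket formulas that drive this proof can be checked on a small example. The following sketch is an illustration under stated assumptions: matrices are nested Python lists, indices are 0-based (unlike the 1-based indices in the text), and the helper names `unit`, `matmul`, `bracket`, `scale`, and `double_bracket` are hypothetical.

```python
def unit(i, j, n):
    """e_ij: 1 in position (i, j), zeros elsewhere (0-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[r][c] - BA[r][c] for c in range(n)] for r in range(n)]

def scale(s, A):
    """Scalar multiple of a matrix."""
    return [[s * a for a in row] for row in A]

def double_bracket(X, i, j):
    """[[X, e_ij], e_ij], the manipulation used in case (i)."""
    e = unit(i, j, len(X))
    return bracket(bracket(X, e), e)

# Case (i): a trace-zero matrix with nonzero off-diagonal entry x_ji.
X = [[1, 2, 3],
     [4, -5, 6],
     [7, 8, 4]]
i, j = 0, 2
case_i = double_bracket(X, i, j) == scale(-2 * X[j][i], unit(i, j, 3))

# Case (ii): a diagonal trace-zero matrix with x_ii != x_jj.
D = [[2, 0, 0],
     [0, -3, 0],
     [0, 0, 1]]
case_ii = bracket(D, unit(i, j, 3)) == scale(D[i][i] - D[j][j], unit(i, j, 3))
```

Both comparisons hold here, matching [[X,e_ij],e_ij] = −2x_ji e_ij and [X,e_ij] = (x_ii − x_jj)e_ij.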

An easy corollary of this result is the following: 

Simplicity of su(n). For each n, su(n) is a simple Lie algebra.

Proof. We use the result from Section 5.6, that

sl(n,C) = su(n) + i su(n) = {A + iB : A, B ∈ su(n)}.

It follows that if ℑ is a nontrivial ideal of su(n) then

ℑ + iℑ = {C + iD : C, D ∈ ℑ}

is a nontrivial ideal of sl(n,C). One only has to check that

1. ℑ + iℑ is not all of sl(n,C), which is true because of the 1-to-1
correspondence X = X_1 + iX_2 between elements X of sl(n,C) and ordered
pairs (X_1, X_2) such that X_1, X_2 ∈ su(n).

If ℑ + iℑ includes each X ∈ sl(n,C) then ℑ includes each X_1 ∈ su(n),
contrary to the assumption that ℑ is not all of su(n).

2. ℑ + iℑ is a vector subspace (over ℂ) of sl(n,C). Closure under sums
is obvious. And the scalar multiple (a + ib)(C + iD) of any C + iD
in ℑ + iℑ is also in ℑ + iℑ for any a + ib ∈ ℂ because

(a + ib)(C + iD) = (aC − bD) + i(bC + aD)

and aC − bD, bC + aD ∈ ℑ by the vector space properties of ℑ.

3. ℑ + iℑ is closed under the Lie bracket with any A + iB ∈ sl(n,C).
This is because, if C + iD ∈ ℑ + iℑ, then

[C + iD, A + iB] = [C,A] − [D,B] + i([D,A] + [C,B]) ∈ ℑ + iℑ

by the closure properties of ℑ.

Thus a nontrivial ideal ℑ of su(n) gives a nontrivial ideal of sl(n,C).
Therefore ℑ does not exist. □
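The decomposition sl(n,C) = su(n) + i su(n) used in this proof can be made explicit. One standard choice, assumed here rather than taken from the text, is X_1 = (X − X*)/2 and X_2 = (X + X*)/(2i), where X* is the conjugate transpose. The sketch below uses hypothetical helper names (`dagger`, `split`, `in_su`) and a sample matrix with Gaussian-integer entries so that the floating-point arithmetic is exact.

```python
def dagger(X):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(X)
    return [[complex(X[c][r]).conjugate() for c in range(n)] for r in range(n)]

def split(X):
    """Return (A, B) with X = A + iB, A = (X - X*)/2, B = (X + X*)/(2i)."""
    n = len(X)
    Xd = dagger(X)
    A = [[(X[r][c] - Xd[r][c]) / 2 for c in range(n)] for r in range(n)]
    # Dividing by 2i is the same as multiplying by -i/2.
    B = [[(X[r][c] + Xd[r][c]) * -0.5j for c in range(n)] for r in range(n)]
    return A, B

def in_su(A):
    """True if A is skew-Hermitian with trace zero."""
    n = len(A)
    Ad = dagger(A)
    skew = all(Ad[r][c] == -A[r][c] for r in range(n) for c in range(n))
    return skew and sum(A[r][r] for r in range(n)) == 0

# A trace-zero complex matrix, i.e. an element of sl(2,C).
X = [[1 + 1j, 2],
     [3j, -1 - 1j]]
A, B = split(X)
recombined = [[A[r][c] + 1j * B[r][c] for c in range(2)] for r in range(2)]
```

Here `A` and `B` both pass `in_su`, and `recombined` reproduces `X`, which illustrates the correspondence between X and the ordered pair (X_1, X_2).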


Exercises 


6.4.1 Verify that

[e_ij, e_jk] = e_ik if i ≠ k,
[e_ij, e_jk] = e_ii − e_jj if i = k.

6.4.2 More generally, verify that [e_ij, e_kl] = δ_jk e_il − δ_li e_kj.

In Section 6.6 we will be using multiples of the basis vectors e_mm by the quaternion
units 𝐢, 𝐣, and 𝐤. Here is a taste of the kind of result we require.

6.4.3 Show that [𝐢(e_pp − e_qq), 𝐣(e_pp − e_qq)] = 2𝐤(e_pp + e_qq).

6.4.4 Show that an ideal of quaternion matrices that includes 𝐢e_mm also includes
𝐣e_mm and 𝐤e_mm.


6.5 Simplicity of so(n) for n > 4

The Lie algebra so(n) of real n × n skew-symmetric matrices has a basis
consisting of the n(n − 1)/2 matrices

E_ij = e_ij − e_ji for i < j.

Indeed, since E_ij has 1 in the (i,j)-position and −1 in the (j,i)-position,
any skew-symmetric matrix is uniquely expressible in the form

X = Σ_{i<j} x_ij E_ij.
Our strategy for proving that so(n) is simple is like that used in Section 6.4
to prove that sl(n,C) is simple. It involves two stages:

• First we suppose that X is a nonzero member of some ideal ℑ and
take Lie brackets of X with suitable basis vectors until we obtain a
nonzero multiple of some basis vector in ℑ.

• Then, by further Lie bracketing, we show that all basis vectors are
in fact in ℑ, so ℑ = so(n).

The first stage, as with sl(n,C), selectively nullifies rows and columns until
only a nonzero multiple of a basis vector remains. It is a little trickier to
do this for so(n), because multiplying by E_ij leaves intact two columns
(or rows, if one multiplies on the left), rather than one. To nullify all but
two, symmetrically positioned, entries we need n > 4, which is no surprise
because so(4) is not simple.

In the first stage we need to keep track of matrix entries as columns
and rows change position, so we introduce a notation that provides number
labels to the left of rows and above columns. For example, we write

\[
E_{ij} = \bordermatrix{
   &  & i  &  & j &  \cr
   &  &    &  &   &  \cr
 i &  &    &  & 1 &  \cr
   &  &    &  &   &  \cr
 j &  & -1 &  &   &  \cr
   &  &    &  &   &  \cr}
\]

to indicate that E_ij has 1 in the (i,j)-position, −1 in the (j,i)-position, and
zeros elsewhere.

Now suppose X is the n × n matrix with (i,j)-entry x_ij. Multiplying X
on the right by E_ij and on the left by −E_ij, we find that

\[
XE_{ij} = \bordermatrix{
 &  & i       &  & j      &  \cr
 &  & -x_{1j} &  & x_{1i} &  \cr
 &  & -x_{2j} &  & x_{2i} &  \cr
 &  & \vdots  &  & \vdots &  \cr
 &  & -x_{nj} &  & x_{ni} &  \cr}
\]

and

\[
-E_{ij}X = \bordermatrix{
   &         &         &        &         \cr
 i & -x_{j1} & -x_{j2} & \cdots & -x_{jn} \cr
   &         &         &        &         \cr
 j & x_{i1}  & x_{i2}  & \cdots & x_{in}  \cr
   &         &         &        &         \cr}
\]

Thus, right multiplication by E_ij preserves only column i, which goes to
position j, and column j, which goes to position i with its sign changed.
Left multiplication by −E_ij preserves row i, which goes to position j, and
row j, which goes to position i with its sign changed.

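The Lie bracket of X with E_ij is the sum of XE_ij and −E_ij X, namely

\[
[X,E_{ij}] = \bordermatrix{
   &         &         &        & i              &        & j             &        &         \cr
   &         &         &        & -x_{1j}        &        & x_{1i}        &        &         \cr
   &         &         &        & -x_{2j}        &        & x_{2i}        &        &         \cr
   &         &         &        & \vdots         &        & \vdots        &        &         \cr
 i & -x_{j1} & -x_{j2} & \cdots & -x_{ij}-x_{ji} & \cdots & x_{ii}-x_{jj} & \cdots & -x_{jn} \cr
   &         &         &        & \vdots         &        & \vdots        &        &         \cr
 j & x_{i1}  & x_{i2}  & \cdots & x_{ii}-x_{jj}  & \cdots & x_{ij}+x_{ji} & \cdots & x_{in}  \cr
   &         &         &        & \vdots         &        & \vdots        &        &         \cr
   &         &         &        & -x_{nj}        &        & x_{ni}        &        &         \cr}
\]

Note that the (i,j)- and (j,i)-entries are zero when X ∈ so(n) because
x_ii = x_jj = 0 in a skew-symmetric matrix. Likewise, the (i,i)- and (j,j)-entries
are zero for a skew-symmetric X, so for X ∈ so(n) we have the simpler
formula (*) below. In short, the rule for bracketing a skew-symmetric X
with E_ij is: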


• Exchange rows i and j, giving the new row i a minus sign. 

• Exchange columns i and j, giving the new column i a minus sign. 

• Put 0 where the new rows and columns meet and 0 everywhere else. 
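\[
[X,E_{ij}] = \bordermatrix{
   &         &         &        & i       &        & j      &        &         \cr
   &         &         &        & -x_{1j} &        & x_{1i} &        &         \cr
   &         &         &        & -x_{2j} &        & x_{2i} &        &         \cr
   &         &         &        & \vdots  &        & \vdots &        &         \cr
 i & -x_{j1} & -x_{j2} & \cdots & 0       & \cdots & 0      & \cdots & -x_{jn} \cr
   &         &         &        & \vdots  &        & \vdots &        &         \cr
 j & x_{i1}  & x_{i2}  & \cdots & 0       & \cdots & 0      & \cdots & x_{in}  \cr
   &         &         &        & \vdots  &        & \vdots &        &         \cr
   &         &         &        & -x_{nj} &        & x_{ni} &        &         \cr}
\qquad (*)
\]

We now make a series of applications of formula (*) for [X,E_ij] to
reduce a given nonzero X ∈ so(n) to a nonzero multiple of a basis vector.
The result is the following theorem.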


Simplicity of so(n). For each n > 4, so(n) is a simple Lie algebra.


Proof. Suppose that ℑ is a nonzero ideal of so(n), and that X is a nonzero
n × n matrix in ℑ. We will show that ℑ contains all the basis vectors E_ij, so
ℑ = so(n).

In the first stage of the proof, we Lie bracket X with a series of four
basis elements to produce a matrix (necessarily skew-symmetric) with just
two nonzero entries. The first bracketing produces the matrix X_1 = [X,E_ij]
shown in (*) above, which has zeros everywhere except in columns i and j
and rows i and j.

For the second bracketing we choose a k ≠ i,j and form X_2 = [X_1,E_jk],
which has row and column j of X_1 moved to the k position, row and column
k of X_1 moved to the j position with their signs changed, and zeros where
these rows and columns meet. Row and column k in X_1 = [X,E_ij] have
at most two nonzero entries (where they meet row and column i and j),
so row and column j in X_2 = [X_1,E_jk] each have at most one, since the
(j,j)-entry −x_ik − x_ki is necessarily zero. The result is that

\[
[X_1,E_{jk}] = \bordermatrix{
   &        &        &        & i      &        & j      &        & k      &        &        \cr
   &        &        &        &        &        &        &        & x_{1i} &        &        \cr
   &        &        &        &        &        &        &        & x_{2i} &        &        \cr
   &        &        &        &        &        &        &        & \vdots &        &        \cr
 i &        &        &        &        &        & x_{jk} &        & 0      &        &        \cr
 j &        &        &        & x_{kj} &        & 0      &        & 0      &        &        \cr
 k & x_{i1} & x_{i2} & \cdots & 0      & \cdots & 0      & \cdots & 0      & \cdots & x_{in} \cr
   &        &        &        &        &        &        &        & \vdots &        &        \cr
   &        &        &        &        &        &        &        & x_{ni} &        &        \cr}
\]

Now choose l ≠ i,j,k and bracket X_2 = [X_1,E_jk] with E_il. The only
nonzero elements in row and column l of X_2 are x_li at position (l,k) in row
l and x_il at position (k,l) in column l. Therefore, X_3 = [X_2,E_il] is given by

\[
[X_2,E_{il}] = \bordermatrix{
   & i       & j      & k       & l      \cr
 i &         &        & -x_{li} &        \cr
 j &         &        &         & x_{kj} \cr
 k & -x_{il} &        &         &        \cr
 l &         & x_{jk} &         &        \cr}
\]

where all entries outside rows and columns i, j, k, l are zero.

To complete this stage we choose m ≠ i,j,k,l and bracket X_3 = [X_2,E_il]
with E_lm. Since row and column m are zero, the result X_4 = [X_3,E_lm] is the
matrix with x_kj in the (j,m)-position and x_jk in the (m,j)-position; that is,

[X_3,E_lm] = x_kj E_jm.

Now we work backward. If X is a nonzero member of the ideal ℑ, let x_kj
be a nonzero entry of X. Provided n > 4, we can choose i ≠ j,k, then
l ≠ i,j,k and m ≠ i,j,k,l, and construct the nonzero element x_kj E_jm of ℑ
by a sequence of Lie brackets as above. Finally, we multiply by 1/x_kj and
obtain E_jm ∈ ℑ.

The second stage obtains all other basis elements of so(n) by forming
Lie brackets of E_jm with other basis elements. This proceeds exactly as
for sl(n,C), because the E_ij satisfy relations like those satisfied by the e_ij,
namely

[E_ij, E_jk] = E_ik if i ≠ k,
[E_ij, E_ki] = E_jk if j ≠ k.

Thus, when n > 4, any nonzero ideal of so(n) is equal to so(n), as
required. □
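The first stage of this proof, four successive brackets ending in x_kj E_jm, can be replayed on a concrete 5 × 5 example. The sketch below is an illustration only: indices are 0-based (unlike the text), matrices are nested Python lists, and the helper names `E`, `matmul`, `bracket`, and `reduce_to_basis_multiple` are hypothetical.

```python
def E(i, j, n):
    """E_ij: 1 in position (i, j), -1 in position (j, i), zeros elsewhere."""
    M = [[0] * n for _ in range(n)]
    M[i][j] = 1
    M[j][i] = -1
    return M

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]

def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[r][c] - BA[r][c] for c in range(n)] for r in range(n)]

def reduce_to_basis_multiple(X, i, j, k, l, m):
    """Apply the four brackets of the proof; i, j, k, l, m must be distinct."""
    n = len(X)
    X1 = bracket(X, E(i, j, n))
    X2 = bracket(X1, E(j, k, n))
    X3 = bracket(X2, E(i, l, n))
    X4 = bracket(X3, E(l, m, n))
    return X4

# An arbitrary 5 x 5 skew-symmetric integer matrix.
X = [[0, 1, 2, 3, 4],
     [-1, 0, 5, 6, 7],
     [-2, -5, 0, 8, 9],
     [-3, -6, -8, 0, 10],
     [-4, -7, -9, -10, 0]]

i, j, k, l, m = 0, 1, 2, 3, 4
X4 = reduce_to_basis_multiple(X, i, j, k, l, m)
expected = [[X[k][j] * entry for entry in row] for row in E(j, m, 5)]
```

With this choice x_kj = −5, so `X4` should have −5 in position (j,m), 5 in position (m,j), and zeros elsewhere, which is what `expected` encodes. The need for five distinct indices is visible in the function signature.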

The first stage of the proof above may seem a little complicated, but 
I doubt that it can be substantially simplified. If it were much simpler it 
would be wrong! We need to use five different values i,j,k,l,m because 
so (4) is not simple, so the result is false for a 4 x 4 matrix X. 

Exercises 

6.5.1 Prove that

[E_ij, E_jk] = E_ik if i ≠ k,
[E_ij, E_ki] = E_jk if j ≠ k.

6.5.2 Also show that [E_ij, E_kl] = 0 if i, j, k, l are all different.

6.5.3 Use Exercises 6.5.1 and 6.5.2 to give another proof that [X_3, E_lm] = x_kj E_jm.
(Hint: Write X_3 as a linear combination of E_ik and E_jl.)

6.5.4 Prove that each 4 × 4 skew-symmetric matrix is uniquely decomposable as
a sum

\[
\begin{pmatrix}
0 & -a & -b & -c \\
a & 0 & -c & b \\
b & c & 0 & -a \\
c & -b & a & 0
\end{pmatrix}
+
\begin{pmatrix}
0 & -x & -y & -z \\
x & 0 & z & -y \\
y & -z & 0 & x \\
z & y & -x & 0
\end{pmatrix}
\]

6.5.5 Setting I = −E_12 − E_34, J = −E_13 + E_24, and K = −E_14 − E_23, show that
[I,J] = 2K, [J,K] = 2I, and [K,I] = 2J.

6.5.6 Deduce from Exercises 6.5.4 and 6.5.5 that so(4) is isomorphic to the direct
product so(3) × so(3) (also known as the direct sum and commonly written
so(3) ⊕ so(3)).


6.6 Simplicity of sp(n)




If X ∈ sp(n) we have X + X̄^T = 0, where X̄ is the result of replacing each
entry in the matrix X by its quaternion conjugate. Thus, if X = (x_ij) and

x_ij = a_ij + b_ij 𝐢 + c_ij 𝐣 + d_ij 𝐤,

then

x̄_ij = a_ij − b_ij 𝐢 − c_ij 𝐣 − d_ij 𝐤

and hence

x_ji = −a_ij + b_ij 𝐢 + c_ij 𝐣 + d_ij 𝐤,

where a_ij, b_ij, c_ij, d_ij ∈ ℝ. (And, of course, the quaternion units 𝐢, 𝐣, and
𝐤 are completely unrelated to the integers i, j used to number rows and
columns.) In particular, each diagonal entry x_ii of X is pure imaginary.

This gives the following obvious basis vectors for sp(n) as a vector
space over ℝ. The matrices e_ii and E_ij are as in Sections 6.4 and 6.5.

• For i = 1, 2, ..., n the matrices 𝐢e_ii, 𝐣e_ii, and 𝐤e_ii.

• For each pair (i,j) with i < j, the matrices E_ij.

• For each pair (i,j) with i < j, the matrices 𝐢Ẽ_ij, 𝐣Ẽ_ij, and 𝐤Ẽ_ij, where
Ẽ_ij is the matrix with 1 in the (i,j)-position, 1 in the (j,i)-position,
and zeros elsewhere.

To prove that sp(n) is simple we suppose that ℑ is an ideal of sp(n) with
a nonzero element X = (x_ij). Then, as before, we reduce X to an arbitrary
basis element by a series of Lie bracketings and vector space operations.
Once we have found all the basis elements in ℑ, we know that ℑ = sp(n).
We have a more motley collection of basis elements than ever before, but
the job of finding them is made easier by the presence of the very simple
basis elements 𝐢e_ii, 𝐣e_ii, and 𝐤e_ii.

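In particular, ℑ includes

\[
[X,\mathbf{i}e_{ii}] = \bordermatrix{
   &                   &        & i                                  &        &                   \cr
   &                   &        & x_{1i}\mathbf{i}                   &        &                   \cr
   &                   &        & \vdots                             &        &                   \cr
 i & -\mathbf{i}x_{i1} & \cdots & x_{ii}\mathbf{i}-\mathbf{i}x_{ii}  & \cdots & -\mathbf{i}x_{in} \cr
   &                   &        & \vdots                             &        &                   \cr
   &                   &        & x_{ni}\mathbf{i}                   &        &                   \cr}
\qquad (*)
\]

and hence also, if i ≠ j,

\[
[[X,\mathbf{i}e_{ii}],\mathbf{i}e_{jj}] = \bordermatrix{
   & i                           & j                           \cr
 i &                             & -\mathbf{i}x_{ij}\mathbf{i} \cr
 j & -\mathbf{i}x_{ji}\mathbf{i} &                             \cr}
\]

where all entries are zero except those explicitly shown.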

This gets the essential matrix calculations out of the way, and we are 
ready to prove our theorem. 

Simplicity of sp(n). For all n, sp(n) is a simple Lie algebra.

Proof. When n = 1, we have sp(1) = su(2), which we proved to be simple
in Section 6.4. Thus we can assume n ≥ 2, which allows us to use the
computations above.

Suppose that ℑ is an ideal of sp(n), with a nonzero element X = (x_ij).

Case (i). All nonzero entries x_ij of X are on the diagonal.

In this case (*) gives the element of ℑ

[X, 𝐢e_ii] = (x_ii 𝐢 − 𝐢 x_ii)e_ii,

and we can similarly obtain the further elements

[X, 𝐣e_ii] = (x_ii 𝐣 − 𝐣 x_ii)e_ii,
[X, 𝐤e_ii] = (x_ii 𝐤 − 𝐤 x_ii)e_ii.

Now if x_ii = b_ii 𝐢 + c_ii 𝐣 + d_ii 𝐤 we find

x_ii 𝐢 − 𝐢 x_ii = −2c_ii 𝐤 + 2d_ii 𝐣,
x_ii 𝐣 − 𝐣 x_ii = 2b_ii 𝐤 − 2d_ii 𝐢,
x_ii 𝐤 − 𝐤 x_ii = −2b_ii 𝐣 + 2c_ii 𝐢.

So, by the closure of ℑ under Lie brackets and real multiples, we have

(−c_ii 𝐤 + d_ii 𝐣)e_ii, (b_ii 𝐤 − d_ii 𝐢)e_ii, (−b_ii 𝐣 + c_ii 𝐢)e_ii in ℑ.

Lie bracketing these three elements with 𝐤𝟏, 𝐢𝟏, 𝐣𝟏 respectively gives us

d_ii 𝐢e_ii, b_ii 𝐣e_ii, c_ii 𝐤e_ii in ℑ.


Thus if x_ii is a nonzero entry in X we have at least one of the basis vectors
𝐢e_ii, 𝐣e_ii, 𝐤e_ii in ℑ. Lie bracketing the basis vector in ℑ with the other two
then gives us all three of 𝐢e_ii, 𝐣e_ii, 𝐤e_ii in ℑ. (Here, the facts that 𝐣𝐤 = −𝐤𝐣 = 𝐢
and so on work in our favor.)

Until now, we have found 𝐢e_ii, 𝐣e_ii, 𝐤e_ii in ℑ only for one value of i. To
complete our collection of diagonal basis vectors we first note that

[E_ij, 𝐢e_ii] = −𝐢Ẽ_ij, [E_ij, 𝐣e_ii] = −𝐣Ẽ_ij, [E_ij, 𝐤e_ii] = −𝐤Ẽ_ij, (**)

as special cases of the formula (*). Since ℑ is closed under real multiples, we have

𝐢Ẽ_ij, 𝐣Ẽ_ij, 𝐤Ẽ_ij in ℑ

for some i and arbitrary j ≠ i. Then we notice that

[𝐢Ẽ_ij, 𝐣Ẽ_ij] = 2𝐤(e_ii + e_jj).

So 𝐤(e_ii + e_jj) and 𝐤e_ii are both in ℑ, and hence their difference 𝐤e_jj is in
ℑ, for any j. We then find 𝐢e_jj in ℑ by Lie bracketing 𝐣e_jj with 𝐤e_jj, and
𝐣e_jj in ℑ by Lie bracketing 𝐢e_jj with 𝐤e_jj.

Now that we have the diagonal basis vectors 𝐢e_ii, 𝐣e_ii, 𝐤e_ii in ℑ for all
i, we can reapply the formulas (**) to get the basis vectors 𝐢Ẽ_ij, 𝐣Ẽ_ij, and
𝐤Ẽ_ij for all i and j with i < j. Finally, we get all the E_ij in ℑ by the formula

[𝐢Ẽ_ij, 𝐢e_ii] = E_ij,

which also follows from (*). Thus all the basis vectors of sp(n) are in ℑ,
and hence ℑ = sp(n).

Case (ii). X has a nonzero entry of the form x_ij = a_ij + b_ij 𝐢 + c_ij 𝐣 + d_ij 𝐤,
for some i < j.

Our preliminary calculations show that the element [[X, 𝐢e_ii], 𝐢e_jj] of ℑ
has zeros everywhere except for −𝐢x_ij 𝐢 in the (i,j)-position, and its negative
conjugate −𝐢x_ji 𝐢 in the (j,i)-position. Explicitly, the (i,j)-entry is

−𝐢x_ij 𝐢 = a_ij + b_ij 𝐢 − c_ij 𝐣 − d_ij 𝐤,

so we have

[[X, 𝐢e_ii], 𝐢e_jj] = a_ij E_ij + (b_ij 𝐢 − c_ij 𝐣 − d_ij 𝐤)Ẽ_ij ∈ ℑ.

If a_ij is the only nonzero coefficient in [[X, 𝐢e_ii], 𝐢e_jj] we have E_ij ∈ ℑ.
Then, writing E_ij = e_ij − e_ji, Ẽ_ij = e_ij + e_ji, we find from the formula
[e_ij, e_ji] = e_ii − e_jj of Section 6.4 the following elements of ℑ:

[E_ij, 𝐢Ẽ_ij] = 2𝐢(e_ii − e_jj),
[E_ij, 𝐣Ẽ_ij] = 2𝐣(e_ii − e_jj),
[E_ij, 𝐤Ẽ_ij] = 2𝐤(e_ii − e_jj).

The first two of these elements give us

[𝐢(e_ii − e_jj), 𝐣(e_ii − e_jj)] = 2𝐤(e_ii + e_jj) ∈ ℑ.

(Another big “thank you” to noncommutative quaternion multiplication!)
Adding the last two elements found, we find 𝐤e_ii ∈ ℑ, so ℑ = sp(n) for the
same reasons as in Case (i).

Finally, if one of the coefficients b_ij, c_ij, or d_ij is nonzero, we simplify
a_ij E_ij + (b_ij 𝐢 − c_ij 𝐣 − d_ij 𝐤)Ẽ_ij by Lie bracketing with 𝐢𝟏, 𝐣𝟏, and 𝐤𝟏. Since

[E_ij, 𝐢𝟏] = 0, [𝐢Ẽ_ij, 𝐢𝟏] = 0, [𝐢Ẽ_ij, 𝐣𝟏] = 2𝐤Ẽ_ij,

and so on, we can nullify all terms in a_ij E_ij + (b_ij 𝐢 − c_ij 𝐣 − d_ij 𝐤)Ẽ_ij except
one with a nonzero coefficient. This gives us, say, 𝐢Ẽ_ij ∈ ℑ. Then we apply
the formula

[𝐢Ẽ_ij, 𝐢e_ii] = E_ij,

which follows from (*), and we again have E_ij ∈ ℑ, so we can reduce to
Case (i) as above. □
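The quaternion identities used in both cases of this proof are routine but easy to get wrong by a sign, so a mechanical check can be reassuring. The sketch below is an illustration under stated assumptions: a quaternion a + b𝐢 + c𝐣 + d𝐤 is stored as the tuple (a, b, c, d), and the names `qmul`, `qsub`, `qneg`, and `commutator` are hypothetical helpers, not notation from the text.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qsub(p, q):
    """Difference of two quaternions."""
    return tuple(s - t for s, t in zip(p, q))

def qneg(p):
    """Negative of a quaternion."""
    return tuple(-s for s in p)

def commutator(x, u):
    """x u - u x, the scalar factor appearing in [X, u e_ii] in Case (i)."""
    return qsub(qmul(x, u), qmul(u, x))

QI, QJ, QK = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# A pure imaginary diagonal entry x_ii = 2i + 3j + 5k (so b, c, d = 2, 3, 5).
x = (0, 2, 3, 5)
with_i = commutator(x, QI)   # expected -2c k + 2d j
with_j = commutator(x, QJ)   # expected  2b k - 2d i
with_k = commutator(x, QK)   # expected -2b j + 2c i

# An off-diagonal entry x_ij = 1 + 2i + 3j + 5k, conjugated as in Case (ii).
y = (1, 2, 3, 5)
minus_iyi = qneg(qmul(qmul(QI, y), QI))   # expected a + b i - c j - d k
```

The four results agree with the displayed formulas for x_ii 𝐢 − 𝐢 x_ii, x_ii 𝐣 − 𝐣 x_ii, x_ii 𝐤 − 𝐤 x_ii, and −𝐢x_ij 𝐢 for this choice of coefficients.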

Exercises 

It was claimed in Section 5.7 that sp(n) is “simpler” than the Lie algebra gl(n,H)
of all n × n quaternion matrices. What was meant is that gl(n,H) is not a simple
Lie algebra; it contains two nontrivial ideals:

ℜ = {X : X = r𝟏 for some r ∈ ℝ} of dimension 1,

𝔗 = {X : re(Tr(X)) = 0} of dimension 4n² − 1,

where re denotes the real part of the quaternion.

6.6.1 Prove that ℜ is an ideal of gl(n,H).

6.6.2 Prove that, for any two quaternions p and q, re(pq) = re(qp).

6.6.3 Using Exercise 6.6.2 or otherwise, check that 𝔗 is an ideal of the real Lie
algebra gl(n,H).

6.6.4 Show that each X ∈ gl(n,H) has a unique decomposition of the form X =
R + T, where R ∈ ℜ and T ∈ 𝔗.

It turns out that ℜ and 𝔗 are the only nontrivial ideals of gl(n,H). This can be
shown by taking the 4n² basis vectors e_ij, 𝐢e_ij, 𝐣e_ij, 𝐤e_ij for gl(n,H), and
considering a nonzero ideal ℑ.

6.6.5 If ℑ has a member X with a nonzero entry x_ij, where i ≠ j, show that ℑ
equals 𝔗 or gl(n,H).

6.6.6 Show in general that ℑ equals either ℜ, 𝔗, or gl(n,H).


6.7 Discussion 

As mentioned in Section 5.8, the classical simple Lie algebras were known 
to Lie in the 1880s, the exceptional simple algebras were discovered by 
Killing soon thereafter, and by 1894 Cartan had completely settled the 
question by an exhaustive proof that they are the only exceptions. The 
number of exceptional algebras, in complex form, is just five. All this be- 
fore it was realized that Lie algebras are quite elementary objects! (namely, 
vector spaces of matrices closed under the Lie bracket operation). It has 
been truly said that the Killing-Cartan classification of simple Lie alge- 
bras is one of the great mathematical discoveries of all time. But it is not 
necessary to use the sophisticated theory of “root systems,” developed by 
Killing and Cartan, merely to prove that the classical algebras so (n), su(n), 
and sp(n) are simple. As we have shown in this chapter, elementary matrix 
calculations suffice. 

The matrix proof that sl(n,C) is simple is sketched in Carter et al.
[1995], p. 10, and the simplicity of su (n) follows from it, but I have 
nowhere seen the corresponding elementary proofs for so (n) and sp (n). 
It is true that the calculations become a little laborious, but it is not a good 
idea to hide all matrix calculations. Many results were first discovered 
because somebody did such a calculation. 

The simplicity proofs in Sections 6.4 to 6.6 are trivial in the sense that 
they can be discovered by anybody with enough patience. Given that sp(n), 
say, is simple, we know that the ideal generated by any nonzero element 
X is the whole of sp(n). Therefore, if we apply enough Lie bracket and 
vector space operations to X, we will eventually obtain all the basis vectors
of sp(n). In other words, brute force search gives a proof that any nonzero
ideal of sp (n) equals sp(n) itself. 

The Lie algebra so(4) is close to being simple, because it is the direct
product so(3) × so(3) of simple Lie algebras. Direct products of simple
Lie algebras are called semisimple. Sophisticated Lie theory tends to focus
on the broader class of semisimple Lie algebras, where so(4) is no longer
an anomaly. With this approach, one can also avoid the embarrassment of 
using the term “complex simple Lie algebras” for algebras such as sl(n,C), 
replacing it by the slightly less embarrassing “complex semisimple Lie al- 
gebras.” (Of course, the real mistake was to call the imaginary numbers 
“complex” in the first place.) 
The matrix logarithm


Preview 

To harness the full power of the matrix exponential we need its inverse 
function, the matrix logarithm function, log. Like the classical log, the 
matrix log is defined by a power series that converges only in a certain 
neighborhood of the identity. This makes results involving the logarithm 
more “local” than those involving the exponential alone, but in this chapter 
we are interested only in local information. 

The central result is that log and exp give a 1-to-1 correspondence,
continuous in both directions, between a neighborhood of 1 in any matrix
Lie group G and a neighborhood of 0 in its Lie algebra $\mathfrak{g} = T_1(G)$. Thus
the log function produces tangents. The proof relates the classical limit
process defining tangents to the infinite series defining the logarithm. The
need for limits motivates the definition of a matrix Lie group as a matrix
group that is suitably closed under limits.

The correspondence shows that elements of G sufficiently close to 1
are all of the form $e^X$, where $X \in \mathfrak{g}$. When two such elements, $e^X$ and $e^Y$,
have a product of the form $e^Z$ it is natural to ask how Z is related to X and
Y. The answer to this question is the Campbell-Baker-Hausdorff theorem,
which says that Z equals an infinite sum of elements of the Lie algebra $\mathfrak{g}$,
namely X + Y plus elements built from X and Y by Lie brackets.

We give a very elementary, but little-known, proof of the Campbell- 
Baker-Hausdorff theorem, due to Eichler. The proof depends entirely on 
manipulation of polynomials in noncommuting variables. 


J. Stillwell, Naive Lie Theory , DOI: 10.1007/978-0-387-78214-0_7, 
© Springer Science+Business Media, LLC 2008 
7.1 Logarithm and exponential

Motivated by the classical infinite series 

\[ \log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \quad \text{valid for real } x \text{ with } |x| < 1, \]

we define the logarithm of a square matrix 1 + A with |A| < 1 by

\[ \log(1+A) = A - \frac{A^2}{2} + \frac{A^3}{3} - \frac{A^4}{4} + \cdots. \]

This series is absolutely convergent (by comparison with the geometric series)
for |A| < 1, and hence log(1 + A) is a well-defined continuous function
in this neighborhood of 1.
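
To make the definition concrete, here is a minimal numerical sketch of the series in Python. The helper names (`mat_mul`, `mat_norm`, `mat_log`) and the cutoff of 80 terms are assumptions made for illustration, and the absolute value is taken here to be the square root of the sum of the squared entries. The example applies the series to a rotation of the plane through a small angle.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_norm(A):
    """Absolute value |A|, assumed to be the root of the sum of squared entries."""
    return math.sqrt(sum(abs(x) ** 2 for row in A for x in row))

def mat_log(M, terms=80):
    """Partial sum of log(1 + A) = A - A^2/2 + A^3/3 - ..., where A = M - 1."""
    n = len(M)
    A = [[M[i][j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    if mat_norm(A) >= 1:
        raise ValueError("the series is only guaranteed to converge for |M - 1| < 1")
    total = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(1, terms + 1):
        power = mat_mul(power, A)          # power now equals A^k
        sign = 1.0 if k % 2 == 1 else -1.0
        for i in range(n):
            for j in range(n):
                total[i][j] += sign * power[i][j] / k
    return total

theta = 0.3
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]
L = mat_log(R)
print(L)
```

Under these assumptions the result should be close to theta times the matrix with rows (0, -1) and (1, 0), which is the matrix whose exponential is the rotation R.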

The fundamental property of the matrix logarithm is the same as that 
of the ordinary logarithm: it is the inverse of the exponential function. 
The proof involves a trick we used in Section 5.2 to prove that $e^A e^B = e^{A+B}$
when AB = BA. Namely, we predict the result of a computation with
infinite series from knowledge of the result in the real variable case. 

Inverse property of matrix logarithm. For any matrix $e^X$ within distance
1 of the identity,

\[ \log(e^X) = X. \]

Proof. Since $e^X = 1 + \frac{X}{1!} + \frac{X^2}{2!} + \cdots$ and $|e^X - 1| < 1$ we can write

\begin{align*}
\log(e^X) &= \log\left(1 + \left(\frac{X}{1!} + \frac{X^2}{2!} + \cdots\right)\right) \\
&= \left(\frac{X}{1!} + \frac{X^2}{2!} + \cdots\right)
 - \frac{1}{2}\left(\frac{X}{1!} + \frac{X^2}{2!} + \cdots\right)^2
 + \frac{1}{3}\left(\frac{X}{1!} + \frac{X^2}{2!} + \cdots\right)^3 - \cdots
\end{align*}

by the definition of the matrix logarithm. Also, the series is absolutely
convergent, so we can rearrange terms so as to collect all powers of $X^m$
together, for each m. This gives

\[ \log(e^X) = X + \left(\frac{1}{2!} - \frac{1}{2}\right)X^2 + \left(\frac{1}{3!} - \frac{1}{2} + \frac{1}{3}\right)X^3 + \cdots. \]

It is hard to describe the terms that make up the coefficient of $X^m$, for
arbitrary m > 1, but we know that their sum is zero! Why? Because exactly
the same terms occur in the expansion of $\log(e^x)$, when $|e^x - 1| < 1$, and
their sum is zero because $\log(e^x) = x$ under these conditions.

Thus $\log(e^X) = X$ as required. □
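
The vanishing of these coefficients can also be checked mechanically in the real-variable case. The following sketch, whose names (`N`, `poly_mul`, `coefficients`) are hypothetical, uses exact rational arithmetic to substitute the truncated series for $e^x - 1$ into the log series and collect the coefficients of the powers of x up to degree 8.

```python
from fractions import Fraction
from math import factorial

N = 8  # highest power of x that is tracked (an arbitrary cutoff)

def poly_mul(p, q):
    """Multiply two polynomials (coefficient lists), discarding powers above N."""
    product = [Fraction(0)] * (N + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= N:
                product[i + j] += a * b
    return product

# u = e^x - 1 = x/1! + x^2/2! + ... (no constant term)
u = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, N + 1)]

# log(1 + u) = u - u^2/2 + u^3/3 - ...; u^j begins at x^j, so j <= N suffices.
coefficients = [Fraction(0)] * (N + 1)
u_power = [Fraction(1)] + [Fraction(0)] * N
for j in range(1, N + 1):
    u_power = poly_mul(u_power, u)         # u_power now equals u^j, truncated
    sign = 1 if j % 2 == 1 else -1
    for d in range(N + 1):
        coefficients[d] += Fraction(sign, j) * u_power[d]

print(coefficients)
```

Every coefficient except that of x comes out exactly zero, which is the cancellation the proof relies on; the same bookkeeping applies to powers of a single matrix X because X commutes with itself.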

The inverse property allows us to derive certain properties of the matrix 
logarithm from corresponding properties of the matrix exponential. For 
example: 

Multiplicative property of matrix logarithm. If AB = BA, and log(A),
log(B), and log(AB) are all defined, then

\[ \log(AB) = \log(A) + \log(B). \]

Proof. Suppose that log(A) = X and log(B) = Y, so $e^X = A$ and $e^Y = B$ by
the inverse property of log. Notice that XY = YX because

\begin{align*}
X &= \log(1 + (A - 1)) = (A - 1) - \frac{(A - 1)^2}{2} + \frac{(A - 1)^3}{3} - \cdots, \\
Y &= \log(1 + (B - 1)) = (B - 1) - \frac{(B - 1)^2}{2} + \frac{(B - 1)^3}{3} - \cdots,
\end{align*}

and the series commute because A and B do. Thus it follows from the
addition formula for exp proved in Section 5.2 that

\[ AB = e^X e^Y = e^{X+Y}. \]

Taking log of both sides of this equation, we get

\[ \log(AB) = X + Y = \log(A) + \log(B) \]

by the inverse property of the matrix logarithm again. □


Exercises 


The log series

\[ \log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \]

was first published by Nicholas Mercator in a book entitled Logarithmotechnia in
1668. Mercator’s derivation of the series was essentially this:

\[ \log(1+x) = \int_0^x \frac{dt}{1+t} = \int_0^x (1 - t + t^2 - t^3 + \cdots)\,dt = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots. \]


Isaac Newton discovered the log series at about the same time, but took the idea 
further, discovering the inverse relationship with the exponential series as well. 
He discovered the exponential series by solving the equation y = log(l + x) as 
follows. 
7.1.1 Supposing $x = a_0 + a_1 y + a_2 y^2 + \cdots$ (the function we call $e^y - 1$), show that

\begin{align*}
y = {} & (a_0 + a_1 y + a_2 y^2 + \cdots) \\
& - \tfrac{1}{2}(a_0 + a_1 y + a_2 y^2 + \cdots)^2 \\
& + \tfrac{1}{3}(a_0 + a_1 y + a_2 y^2 + \cdots)^3 - \cdots \tag{*}
\end{align*}

7.1.2 By equating the constant terms on both sides of (*), show that $a_0 = 0$.

7.1.3 By equating coefficients of y on both sides of (*), show that $a_1 = 1$.

7.1.4 By equating coefficients of $y^2$ on both sides of (*), show that $a_2 = 1/2$.

7.1.5 See whether you can go as far as Newton, who also found that $a_3 = 1/6$,
$a_4 = 1/24$, and $a_5 = 1/120$.

Newton then guessed that $a_n = 1/n!$ “by observing the analogy of the series.”
Unlike us, he did not have independent knowledge of the exponential function 
ensuring that its coefficients follow the pattern observed in the first few. 

As with exp, term-by-term differentiation and series manipulation give some 
familiar formulas. 

7.1.6 Prove that $\frac{d}{dt}\log(1 + At) = A(1 + At)^{-1}$.


7.2 The exp function on the tangent space 

For all the groups G we have seen so far it has been easy to find a general
form for tangent vectors $A'(0)$ from the equation(s) defining the members
A of G. We can then check that all the matrices X of this form are mapped
into G by exp, and that $e^{tX}$ lies in G along with $e^X$, in which case X is a
tangent vector to G at 1. Thus exp solves the problem of finding enough
smooth paths in G to give the whole tangent space $T_1(G) = \mathfrak{g}$.

But if we are not given an equation defining the matrices A in G, we 
may not be able to find tangent matrices in the form A'(0) in the first place, 
so we need a different route to the tangent space. The log function looks 
promising, because we can certainly get back into G by applying exp to a 
value X of the log function, since exp inverts log. 

However, it is not clear that log maps any part of G into $T_1(G)$, except
the single point $1 \in G$. We need to make a closer study of the relation
between the limits that define tangent vectors and the definition of log. 
This train of thought leads to the realization that G must be closed under 
certain limits, and it prompts the following definition (foreshadowed in 
Section 1.1) of the main concept in this book. 
Definition. A matrix Lie group G is a group of matrices that is closed
under nonsingular limits. That is, if $A_1, A_2, A_3, \ldots$ is a convergent sequence
of matrices in G, with limit A, and if $\det(A) \ne 0$, then $A \in G$.

This closure property makes possible a fairly immediate proof that exp
indeed maps $T_1(G)$ back into G.

Exponentiation of tangent vectors. If $A'(0)$ is the tangent vector at 1 to
a matrix Lie group G, then $e^{A'(0)} \in G$. That is, exp maps the tangent space
$T_1(G)$ into G.

Proof. Suppose that A(t) is a smooth path in G such that A(0) = 1, and
that $A'(0)$ is the corresponding tangent vector at 1. By definition of the
derivative we have

\[ A'(0) = \lim_{\Delta t \to 0} \frac{A(\Delta t) - 1}{\Delta t} = \lim_{n \to \infty} \frac{A(1/n) - 1}{1/n}, \]

where n takes all natural number values greater than some $n_0$. We compare
this formula with the definition of log A(1/n),

\[ \log A(1/n) = (A(1/n) - 1) - \frac{(A(1/n) - 1)^2}{2} + \frac{(A(1/n) - 1)^3}{3} - \cdots, \]

which also holds for natural numbers n greater than some $n_0$. Dividing
both sides of the log formula by 1/n we get

\[ n \log A(1/n) = \frac{\log A(1/n)}{1/n} = \frac{A(1/n) - 1}{1/n} - \frac{A(1/n) - 1}{1/n}\left[\frac{A(1/n) - 1}{2} - \frac{(A(1/n) - 1)^2}{3} + \cdots\right]. \tag{*} \]

Now, taking $n_0$ large enough that $|A(1/n) - 1| < \varepsilon < 1/2$, the series in
square brackets has sum of absolute value less than $\varepsilon + \varepsilon^2 + \varepsilon^3 + \cdots < 2\varepsilon$,
so its sum tends to 0 as n tends to ∞. It follows that the right side of (*) has
the limit

\[ A'(0) - A'(0)[0] = A'(0) \]

as $n \to \infty$. The left side of (*), $n \log A(1/n)$, has the same limit, so

\[ A'(0) = \lim_{n \to \infty} n \log A(1/n). \tag{**} \]

Taking exp of equation (**), we get

\begin{align*}
e^{A'(0)} &= e^{\lim_{n \to \infty} n \log A(1/n)} \\
&= \lim_{n \to \infty} e^{n \log A(1/n)} && \text{because exp is continuous} \\
&= \lim_{n \to \infty} \left(e^{\log A(1/n)}\right)^n && \text{because } e^{A+B} = e^A e^B \text{ when } AB = BA \\
&= \lim_{n \to \infty} A(1/n)^n && \text{because exp is the inverse of log.}
\end{align*}

Now $A(1/n) \in G$ by assumption, so $A(1/n)^n \in G$ because G is closed under
products. We therefore have a convergent sequence of members of G, and
its limit $e^{A'(0)}$ is nonsingular because it has inverse $e^{-A'(0)}$. So $e^{A'(0)} \in G$,
by the closure of G under nonsingular limits.

In other words, exp maps the tangent space $T_1(G) = \mathfrak{g}$ into G. □
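
The limit $e^{A'(0)} = \lim A(1/n)^n$ at the heart of this proof can be watched numerically. The sketch below is an illustration only: the path (rotation of the plane through the angle $t + t^2$) and the helper names are assumptions chosen so that A(t) is a smooth path in SO(2) that is not itself of the form $e^{tX}$.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_pow(A, n):
    """A^n for a natural number n, by repeated squaring."""
    size = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    while n > 0:
        if n % 2 == 1:
            result = mat_mul(result, A)
        A = mat_mul(A, A)
        n //= 2
    return result

def rotation(angle):
    return [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle),  math.cos(angle)]]

def path(t):
    """A smooth path in SO(2) with A(0) = 1 (a hypothetical example)."""
    return rotation(t + t * t)

# Here A'(0) is the matrix with rows (0, -1) and (1, 0), and its exponential
# is the rotation through 1 radian.
target = rotation(1.0)

def gap(n):
    """Largest entry, in absolute value, of A(1/n)^n - e^{A'(0)}."""
    P = mat_pow(path(1.0 / n), n)
    return max(abs(P[i][j] - target[i][j]) for i in range(2) for j in range(2))

gaps = [gap(n) for n in (10, 100, 1000, 10000)]
print(gaps)
```

In this example $A(1/n)^n$ is the rotation through $1 + 1/n$, so the gaps shrink roughly in proportion to 1/n, and each $A(1/n)^n$ is itself a member of SO(2), as the proof requires.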

The proof in the opposite direction, from G into $T_1(G)$, is more subtle.
It requires a deeper study of limits, which we undertake in the next section.

Exercises 

7.2.1 Deduce from exponentiation of tangent vectors that

\[ T_1(G) = \{X : e^{tX} \in G \text{ for all } t \in \mathbb{R}\}. \]

The property $T_1(G) = \{X : e^{tX} \in G \text{ for all } t \in \mathbb{R}\}$ is used as a definition of $T_1(G)$
by some authors, for example Hall [2003]. It has the advantage of making it clear
that exp maps $T_1(G)$ into G. On the other hand, with this definition, we have to
check that $T_1(G)$ is a vector space.

7.2.2 Given X as the tangent vector to $e^{tX}$, and Y as the tangent vector to $e^{tY}$,
show that X + Y is the tangent vector to $A(t) = e^{tX} e^{tY}$.

7.2.3 Similarly, show that if X is a tangent vector then so is rX for any $r \in \mathbb{R}$.

The formula $A'(0) = \lim_{n \to \infty} n \log A(1/n)$ that emerges in the proof above can
actually be used in two directions. It can be used to prove that exp maps $T_1(G)$
into G when combined with the fact that G is closed under products (and hence
under nth powers). And it can be used to prove that log maps (a neighborhood of
1 in) G into $T_1(G)$ when combined with the fact that G is closed under nth roots.

Unfortunately, proving closure under nth roots is as hard as proving that log
maps into $T_1(G)$, so we need a different approach to the latter theorem. Nevertheless,
it is interesting to see how nth roots are related to the behavior of the log
function, so we develop the relationship in the following exercises.

7.2.4 Suppose that, for each A in some neighborhood $\mathcal{N}$ of 1 in G, there is a
smooth function A(t), with values in G, such that $A(1/n) = A^{1/n}$ for n = 1, 2, 3, ....
Show that $A'(0) = \log A$, so $\log A \in T_1(G)$.

7.2.5 Suppose, conversely, that log maps some neighborhood $\mathcal{N}$ of 1 in G into
$T_1(G)$. Explain why we can assume that $\mathcal{N}$ is mapped by log onto an
ε-ball $N_\varepsilon(0)$ in $T_1(G)$.

7.2.6 Taking $\mathcal{N}$ as in Exercise 7.2.4, and $A \in \mathcal{N}$, show that $t \log A \in T_1(G)$ for
all $t \in [0, 1]$, and deduce that $A^{1/n}$ exists for n = 1, 2, 3, ....

7.3 Limit properties of log and exp 

In 1929, von Neumann created a new approach to Lie theory by confin- 
ing attention to matrix Lie groups. Even though the most familiar Lie 
groups are matrix groups (and, in fact, the first nonmatrix examples were 
not discovered until the 1930s), Lie theory began as the study of general 
“continuous” groups and von Neumann's approach was a radical simplifi- 
cation. In particular, von Neumann defined “tangents” prior to the concept 
of differentiability — going back to the idea that a tangent vector is the limit 
of a sequence of “chord” vectors — as one sees tangents in a first calculus 
course (Figure 7.1). 



Definition. X is a sequential tangent vector to G at 1 if there is a sequence
$(A_m)$ of members of G, and a sequence $(\alpha_m)$ of real numbers, such that
$A_m \to 1$ and $(A_m - 1)/\alpha_m \to X$ as $m \to \infty$.

If A(t) is a smooth path in G with A(0) = 1, then the sequence of points
$A_m = A(1/m)$ tends to 1 and

\[ A'(0) = \lim_{m \to \infty} \frac{A_m - 1}{1/m}, \]

so any ordinary tangent vector $A'(0)$ is a sequential tangent vector. But
sometimes it is convenient to arrive at tangent vectors via sequences rather
than via smooth paths, so it would be nice to be sure that all sequential
tangent vectors are in fact ordinary tangent vectors. This is confirmed by
the following theorem.

Smoothness of sequential tangency. Suppose that $(A_m)$ is a sequence in
a matrix Lie group G such that $A_m \to 1$ as $m \to \infty$, and that $(\alpha_m)$ is a
sequence of real numbers such that $(A_m - 1)/\alpha_m \to X$ as $m \to \infty$.
Then $e^{tX} \in G$ for all real t (and therefore X is the tangent at 1 to the
smooth path $e^{tX}$).

Proof. Let $X = \lim_{m \to \infty} \frac{A_m - 1}{\alpha_m}$. First we prove that $e^X \in G$. Then we indicate
how the proof may be modified to show that $e^{tX} \in G$.

Given that $(A_m - 1)/\alpha_m \to X$ as $m \to \infty$, it follows that $\alpha_m \to 0$ as
$A_m \to 1$, and hence $1/\alpha_m \to \infty$. Then if we set

\[ a_m = \text{nearest integer to } 1/\alpha_m, \]

we also have $a_m(A_m - 1) \to X$ as $m \to \infty$. Since $a_m$ is an integer,

\begin{align*}
\log(A_m^{a_m}) &= a_m \log(A_m) && \text{by the multiplicative property of log} \\
&= a_m(A_m - 1) - a_m(A_m - 1)\left[\frac{A_m - 1}{2} - \frac{(A_m - 1)^2}{3} + \cdots\right].
\end{align*}

And since $A_m \to 1$ we can argue as in Section 7.2 that the series in square
brackets tends to zero. Then, since $\lim_{m \to \infty} a_m(A_m - 1) = X$, we have

\[ X = \lim_{m \to \infty} \log(A_m^{a_m}). \]

It follows, by the inverse property of log and the continuity of exp, that

\[ e^X = \lim_{m \to \infty} A_m^{a_m}. \]

Since $a_m$ is an integer, $A_m^{a_m} \in G$ by the closure of G under products. And
then, by the closure of G under nonsingular limits,

\[ e^X = \lim_{m \to \infty} A_m^{a_m} \in G. \]

To prove that $e^{tX} \in G$ for any real t one replaces $1/\alpha_m$ in the above
argument by $t/\alpha_m$. If

\[ b_m = \text{nearest integer to } t/\alpha_m, \]

we similarly have $b_m(A_m - 1) \to tX$ as $m \to \infty$. And if we consider the
series for

\[ \log(A_m^{b_m}) = b_m \log(A_m) \]

we similarly find that

\[ e^{tX} = \lim_{m \to \infty} A_m^{b_m} \in G \]

by the closure of G under nonsingular limits. □
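
The passage from the sequence $(A_m)$ to the curve $e^{tX}$ can be illustrated in the simplest case, the group U(1) of unit complex numbers viewed as 1×1 matrices. In this sketch the sequence $A_m = (1 + i/m)/|1 + i/m|$ with $\alpha_m = 1/m$ is a hypothetical choice, made so that $(A_m - 1)/\alpha_m$ tends to X = i; the function name and the value of t are likewise assumptions.

```python
import cmath

def sequential_power(t, m):
    """Return A_m^{b_m}, where b_m is the nearest integer to t/alpha_m."""
    z = 1 + 1j / m
    A_m = z / abs(z)            # a unit complex number, so A_m lies in U(1)
    alpha_m = 1.0 / m           # (A_m - 1)/alpha_m tends to X = i
    b_m = round(t / alpha_m)    # an integer, so A_m^{b_m} stays in the group
    return A_m ** b_m

t = 0.7321
limit = cmath.exp(1j * t)       # e^{tX} with X = i
gaps = [abs(sequential_power(t, m) - limit) for m in (10, 100, 1000, 10000)]
print(gaps)
```

Each power $A_m^{b_m}$ has absolute value 1, up to rounding, and the gaps to $e^{it}$ shrink as m grows, which is the behavior the theorem asserts for any matrix Lie group.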

This theorem is the key to proving that log maps a neighborhood of 1 in
G onto a neighborhood of 0 in $T_1(G)$, as we will see in the next section. It
is also the core of the result of von Neumann [1929] that matrix Lie groups
are “smooth manifolds.” We do not define or investigate smooth manifolds
in this book, but one can glimpse the emergence of “smoothness” in the
passage from the sequence $(A_m)$ to the curve $e^{tX}$ in the above proof.

Exercises 

Having proved that sequential tangents are the same as the smooth tangents we 
considered previously, we conclude that sequential tangents have the real vector 
space properties. Still, it is interesting to see how the vector space properties 
follow from the definition of sequential tangent. 

7.3.1 If X and Y are sequential tangents to a group G at 1, show that X + Y is also.

7.3.2 If X is a sequential tangent to a group G at 1, show that rX is also, for any
real number r.

7.4 The log function into the tangent space 

By a “neighborhood” of 1 in G we mean a set of the form 
\[ N_\delta(1) = \{A \in G : |A - 1| < \delta\}, \]

where |B| denotes the absolute value of the matrix B, defined in Section 4.5.
We also call $N_\delta(1)$ the δ-neighborhood of 1. Then we have the following
theorem. 
The log of a neighborhood of 1. For any matrix Lie group G there is a
neighborhood $N_\delta(1)$ mapped into $T_1(G)$ by log.

Proof. Suppose on the contrary that no $N_\delta(1)$ is mapped into $T_1(G)$ by log.
Then we can find $A_1, A_2, A_3, \ldots \in G$ with $A_m \to 1$ as $m \to \infty$, and with each
$\log A_m \notin T_1(G)$.

Of course, G is contained in some $M_n(\mathbb{C})$. So each $\log A_m$ is in $M_n(\mathbb{C})$
and we can write

\[ \log A_m = X_m + Y_m, \]

where $X_m$ is the component of $\log A_m$ in $T_1(G)$ and $Y_m \ne 0$ is the component
in $T_1(G)^\perp$, the orthogonal complement of $T_1(G)$ in $M_n(\mathbb{C})$. We note that
$X_m, Y_m \to 0$ as $m \to \infty$ because $A_m \to 1$ and log is continuous.

Next we consider the matrices $Y_m/|Y_m| \in T_1(G)^\perp$. These all have absolute
value 1, so they lie on the sphere $\mathcal{S}$ of radius 1 and center 0 in
$M_n(\mathbb{C})$. It follows from the boundedness of $\mathcal{S}$ that the sequence $(Y_m/|Y_m|)$
has a convergent subsequence, and the limit Y of this subsequence is also
a vector in $T_1(G)^\perp$ of length 1. In particular, $Y \notin T_1(G)$.

Taking the subsequence with limit Y in place of the original sequence
we have

\[ \lim_{m \to \infty} \frac{Y_m}{|Y_m|} = Y. \]

Finally, we consider the sequence of terms

\[ T_m = e^{-X_m} A_m. \]

Each $T_m \in G$ because $-X_m \in T_1(G)$; hence $e^{-X_m} \in G$ by the exponentiation
of tangent vectors in Section 7.2, and $A_m \in G$ by hypothesis. On the other
hand, $A_m = e^{X_m + Y_m}$ by the inverse property of log, so

\begin{align*}
T_m &= e^{-X_m} e^{X_m + Y_m} \\
&= \left(1 - X_m + \frac{X_m^2}{2!} + \cdots\right)\left(1 + X_m + Y_m + \frac{(X_m + Y_m)^2}{2!} + \cdots\right) \\
&= 1 + Y_m + \text{higher-order terms}.
\end{align*}

Admittedly, these higher-order terms include $X_m^2$, and other powers of $X_m$,
that are not necessarily small in comparison with $Y_m$. However, these powers
of $X_m$ are those in

\[ 1 = e^{-X_m} e^{X_m}, \]

so they sum to zero. (I thank Brian Hall for this observation.) Therefore,

\[ \lim_{m \to \infty} \frac{T_m - 1}{|Y_m|} = \lim_{m \to \infty} \frac{Y_m}{|Y_m|} = Y. \]

Since each $T_m \in G$, it follows that the sequential tangent

\[ \lim_{m \to \infty} \frac{T_m - 1}{|Y_m|} = Y \]

is in $T_1(G)$ by the smoothness of sequential tangents proved in Section 7.3.

But $Y \notin T_1(G)$, as observed above. This contradiction shows that our
original assumption was false, so there is a neighborhood $N_\delta(1)$ mapped
into $T_1(G)$ by log. □

Corollary. The log function gives a bijection, continuous in both directions,
between $N_\delta(1)$ in G and $\log N_\delta(1)$ in $T_1(G)$.

Proof. The continuity of log, and of its inverse function exp, shows that
there is a 1-to-1 correspondence, continuous in both directions, between
$N_\delta(1)$ and its image $\log N_\delta(1)$ in $T_1(G)$. □

If $N_\delta(1)$ in G is mapped into $T_1(G)$ by log, then each $A \in N_\delta(1)$ has the
form $A = e^X$, where $X = \log A \in T_1(G)$. Thus the paradise of SO(2) and
SU(2), where each group element is the exponential of a tangent vector,
is partly regained by the theorem above. Any matrix Lie group G has at
least a neighborhood of 1 in which each element is the exponential of a
tangent vector.

The corollary tells us that the set $\log N_\delta(1)$ is a “neighborhood” of 0
in $T_1(G)$ in a more general sense (the topological sense) that we will
discuss in Chapter 8. The existence of this continuous bijection between
neighborhoods finally establishes that G has a topological dimension equal
to the real vector space dimension of $T_1(G)$, thanks to the deep theorem
of Brouwer [1911] on the invariance of topological dimension. This gives 
a broad justification for the Lie theory convention, already mentioned in 
Section 5.5, of defining the dimension of a Lie group to be the dimension 
of its Lie algebra. In practice, arguments about dimension are made at the 
Lie algebra level, where we can use linear algebra, so we will not actually 
need the topological concept of dimension. 

Exercises 

The continuous bijection between neighborhoods of 1 in G and of 0 in $T_1(G)$
enables us to show the existence of nth roots in a matrix Lie group.

7.4.1 Show that each $A \in N_\delta(1)$ has a unique nth root, for n = 1, 2, 3, ....

7.4.2 Show that the 2x2 identity matrix 1 has two square roots in SO(2), but that 
one of them is “far” from 1. 

7.5 SO(n), SU(n), and Sp(n) revisited

In Section 3.8 we proved Schreier’s theorem that any discrete normal subgroup
of a path-connected group lies in its center. This gives us the discrete
normal subgroups of SO(n), SU(n), and Sp(n), since the latter groups are
path-connected and we found their centers in Section 3.7. What remains is
to find out whether SO(n), SU(n), and Sp(n) have any nondiscrete normal
subgroups. We claimed in Section 3.9 that the tangent space would enable
us to see any nondiscrete normal subgroups, and we are finally in a position
to explain why.

For convenience we assume a plausible result that will be proved rigorously
in Section 8.6: if $N_\delta(1)$ is a neighborhood of 1 in a path-connected
group G, then any element of G is a product of members of $N_\delta(1)$. We say
that $N_\delta(1)$ generates the whole group G. With this assumption, we have
the following theorem.

Tangent space visibility. If G is a path-connected matrix Lie group with
discrete center and a nondiscrete normal subgroup H, then $T_1(H) \ne \{0\}$.

Proof. Since the center Z(G) of G is discrete, and H is not, we can find a
neighborhood $N_\delta(1)$ in G that includes elements $B \ne 1$ in H but no member
of Z(G) other than 1. If $B \ne 1$ is a member of H in $N_\delta(1)$, then B does not
commute with some $A \in N_\delta(1)$. If B commutes with all elements of $N_\delta(1)$
then B commutes with all elements of G (because $N_\delta(1)$ generates G), so
$B \in Z(G)$, contrary to our choice of $N_\delta(1)$.

By taking δ sufficiently small we can ensure, by the theorem of the
previous section, that $A = e^X$ for some $X \in T_1(G)$. Indeed, we can ensure
that the whole path $A(t) = e^{tX}$ is in $N_\delta(1)$ for $0 \le t \le 1$.

Now consider the smooth path $C(t) = e^{tX} B e^{-tX} B^{-1}$, which runs from
1 to $e^X B e^{-X} B^{-1} = ABA^{-1}B^{-1}$ in G. A calculation using the product rule
for differentiation (exercise) shows that the tangent vector to C(t) at 1 is

\[ C'(0) = X - BXB^{-1}. \]

Since H is a normal subgroup of G, and $B \in H$, we have $e^{tX} B e^{-tX} \in H$.
Then $e^{tX} B e^{-tX} B^{-1} \in H$ as well, so C(t) is in fact a smooth path in H and

\[ C'(0) = X - BXB^{-1} \in T_1(H). \]

Thus to prove that $T_1(H) \ne \{0\}$ it suffices to show that $X - BXB^{-1} \ne 0$.
Well,

\begin{align*}
X - BXB^{-1} = 0 &\Rightarrow BXB^{-1} = X \\
&\Rightarrow e^{BXB^{-1}} = e^X \\
&\Rightarrow B e^X B^{-1} = e^X \\
&\Rightarrow B e^X = e^X B \\
&\Rightarrow BA = AB,
\end{align*}

contrary to our choice of A and B.

This contradiction proves that $T_1(H) \ne \{0\}$. □
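
The formula $C'(0) = X - BXB^{-1}$ used in this proof can be checked numerically before it is derived. The sketch below is an illustration under stated assumptions: G is taken to be SO(3), X is a sample skew-symmetric matrix, B is a rotation about the z-axis (so its inverse is its transpose), and the derivative is approximated by a central difference with a hypothetical step h.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_scale(c, A):
    return [[c * x for x in row] for row in A]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_exp(X, terms=30):
    """Partial sum of the exponential series 1 + X + X^2/2! + ..."""
    size = len(X)
    term = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    total = [row[:] for row in term]
    for k in range(1, terms + 1):
        term = mat_scale(1.0 / k, mat_mul(term, X))   # term == X^k / k!
        total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(total, term)]
    return total

# Sample data (assumptions): X in so(3), B a rotation about the z-axis.
X = [[0.0, -0.2, 0.1],
     [0.2, 0.0, -0.3],
     [-0.1, 0.3, 0.0]]
phi = 0.4
B = [[math.cos(phi), -math.sin(phi), 0.0],
     [math.sin(phi),  math.cos(phi), 0.0],
     [0.0, 0.0, 1.0]]
B_inv = [list(col) for col in zip(*B)]   # transpose, since B is orthogonal

def C(t):
    """The path C(t) = e^{tX} B e^{-tX} B^{-1}."""
    left = mat_mul(mat_exp(mat_scale(t, X)), B)
    right = mat_mul(mat_exp(mat_scale(-t, X)), B_inv)
    return mat_mul(left, right)

h = 1e-5
finite_difference = mat_scale(1.0 / (2 * h), mat_sub(C(h), C(-h)))
predicted = mat_sub(X, mat_mul(mat_mul(B, X), B_inv))
max_gap = max(abs(x) for row in mat_sub(finite_difference, predicted) for x in row)
print(max_gap)
```

The printed gap should be tiny compared with the entries of X, consistent with the product-rule calculation left as an exercise.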

Corollary. If H is a nontrivial normal subgroup of G under the conditions
above, then $T_1(H)$ is a nontrivial ideal of $T_1(G)$.

Proof. We know from Section 6.1 that $T_1(H)$ is an ideal of $T_1(G)$, and
$T_1(H) \ne \{0\}$ by the theorem.

If $T_1(H) = T_1(G)$ then H fills $N_\delta(1)$ in G, by the log-exp bijection
between neighborhoods of the identity in G and $T_1(G)$. But then H = G
because G is path-connected and hence generated by $N_\delta(1)$. Thus if $H \ne G$,
then $T_1(H) \ne T_1(G)$. □

It follows from the theorem that any nondiscrete normal subgroup H
of G = SO(n), SU(n), Sp(n) gives a nonzero ideal $T_1(H)$ in $T_1(G)$. The
corollary says that $T_1(H)$ is nontrivial, that is, $T_1(H) \ne T_1(G)$ if $H \ne G$.
Thus we finally know for sure that the only nontrivial normal subgroups of
SO(n), SU(n), and Sp(n) are the subgroups of their centers. (And hence
all the nontrivial normal subgroups are finite cyclic groups.)

SO(3) revisited 

In Section 2.3 we showed that SO(3) is simple — the result that launched 
our whole investigation of Lie groups — by a somewhat tricky geometric 
argument. We can now give a proof based on the easier facts that the center 
of SO(3) is trivial, which was proved in Section 3.5 (also in Exercises 3.5.4 
and 3.5.5), and that so(3) is simple, which was proved in Section 6.1. The 
hard work can be done by general theorems. 
By the theorem in Section 3.8, any discrete normal subgroup of SO(3)
is contained in Z(SO(3)), and hence is trivial. By the corollary above, 
and the theorem in Section 6.1, any nondiscrete normal subgroup of SO(3) 
yields a nontrivial ideal of so(3), which does not exist. 

Exercises 

7.5.1 If $C(t) = e^{tX} B e^{-tX} B^{-1}$, check that $C'(0) = X - BXB^{-1}$.

7.5.2 Give an example of a connected matrix Lie group with a nondiscrete normal
subgroup H such that $T_1(H) = \{0\}$.

7.5.3 Prove that U(n) has no nontrivial normal subgroup except Z(U(n)).

7.5.4 The tangent space visibility theorem also holds if G is not path-connected. 
Explain how to modify the proof in this case. 


7.6 The Campbell-Baker-Hausdorff theorem 


The results of Section 7.4 show that, in some neighborhood of 1, any two
elements of G have the form $e^X$ and $e^Y$ for some X, Y in $\mathfrak{g}$, and that the
product of these two elements, $e^X e^Y$, is $e^Z$ for some Z in $\mathfrak{g}$. The
Campbell-Baker-Hausdorff theorem says that more than this is true, namely, the Z
such that $e^X e^Y = e^Z$ is the sum of a series X + Y + Lie bracket terms composed
from X and Y. In this sense, the Lie bracket on $\mathfrak{g}$ “determines” the
product operation on G.

To give an inkling of how this theorem comes about, we expand $e^X$
and $e^Y$ as infinite series, form the product series, and calculate the first few
terms of its logarithm, Z. By the definition of the exponential function we
have

\[ e^X = 1 + \frac{X}{1!} + \frac{X^2}{2!} + \frac{X^3}{3!} + \cdots, \qquad e^Y = 1 + \frac{Y}{1!} + \frac{Y^2}{2!} + \frac{Y^3}{3!} + \cdots, \]

and therefore

\[ e^X e^Y = 1 + X + Y + XY + \frac{X^2}{2!} + \frac{Y^2}{2!} + \cdots + \frac{X^m Y^n}{m!\,n!} + \cdots \]

with a term for each pair of integers $m, n \ge 0$. It follows, since

\[ \log(1 + W) = W - \frac{W^2}{2} + \frac{W^3}{3} - \frac{W^4}{4} + \cdots, \]

that

\begin{align*}
Z = \log(e^X e^Y) &= \left(X + Y + XY + \frac{X^2}{2!} + \frac{Y^2}{2!} + \cdots\right) \\
&\quad - \frac{1}{2}\left(X + Y + XY + \frac{X^2}{2!} + \frac{Y^2}{2!} + \cdots\right)^2 \\
&\quad + \frac{1}{3}\left(X + Y + XY + \frac{X^2}{2!} + \frac{Y^2}{2!} + \cdots\right)^3 - \cdots \\
&= X + Y + \frac{1}{2}XY - \frac{1}{2}YX + \text{higher-order terms} \\
&= X + Y + \frac{1}{2}[X, Y] + \text{higher-order terms}.
\end{align*}
The hard part of the Campbell-Baker-Hausdorff theorem is to prove that 
all the higher-order terms are composed from X and Y by Lie brackets. 
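
The low-order expansion can be tested numerically. The following sketch (helper names, the sample matrices, and the series cutoffs are all assumptions made for illustration) computes $Z = \log(e^X e^Y)$ from the two series for small noncommuting X and Y, and compares it with $X + Y + \frac{1}{2}[X, Y]$. If the neglected terms begin at third order, halving the size of X and Y should shrink the discrepancy by a factor of about 8.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_add(A, B, scale=1.0):
    """Entrywise A + scale * B."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(size):
    return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]

def mat_exp(X, terms=30):
    """Partial sum of 1 + X + X^2/2! + ..."""
    term, total = identity(len(X)), identity(len(X))
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, X)]
        total = mat_add(total, term)
    return total

def mat_log(M, terms=60):
    """Partial sum of W - W^2/2 + W^3/3 - ..., where W = M - 1 is assumed small."""
    size = len(M)
    W = mat_add(M, identity(size), scale=-1.0)
    power = identity(size)
    total = [[0.0] * size for _ in range(size)]
    for k in range(1, terms + 1):
        power = mat_mul(power, W)
        total = mat_add(total, power, scale=(1.0 if k % 2 == 1 else -1.0) / k)
    return total

def second_order_gap(eps):
    """Largest entry of log(e^X e^Y) - (X + Y + [X, Y]/2) for sample X, Y of size eps."""
    X = [[0.0, eps], [0.0, 0.0]]
    Y = [[0.0, 0.0], [eps, 0.0]]
    Z = mat_log(mat_mul(mat_exp(X), mat_exp(Y)))
    bracket = mat_add(mat_mul(X, Y), mat_mul(Y, X), scale=-1.0)
    approx = mat_add(mat_add(X, Y), bracket, scale=0.5)
    return max(abs(x) for row in mat_add(Z, approx, scale=-1.0) for x in row)

ratio = second_order_gap(0.1) / second_order_gap(0.05)
print(ratio)
```

A ratio close to 8 is what cubic leading error predicts. This is only a numerical hint; it says nothing about whether the higher-order terms are built from Lie brackets, which is the real content of the theorem.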

Campbell attempted to do this in 1897. His work was amended by 
Baker in 1905, with further corrections by Hausdorff producing a com- 
plete proof in 1906. However, these first proofs were very long, and many 
attempts have since been made to derive the theorem with greater economy 
and insight. Modern textbook proofs are typically only a few pages long, 
but they draw on differentiation, integration, and specialized machinery 
from Lie theory. 

The most economical proof I know is one by Eichler [1968]. It is only 
two pages long and purely algebraic, showing by induction on n that all 
terms of order n > 1 are linear combinations of Lie brackets. The algebra 
is very simple, but ingenious (as you would expect, since the theorem is 
surely not trivial). In my opinion, this is also an insightful proof, showing 
as it does that the theorem depends only on simple algebraic facts. I present 
Eichler’s proof, with some added explanation, in the next section. 

Exercises 

7.6.1 Show that the cubic term in $\log(e^X e^Y)$ is

\[ \frac{1}{12}\left(X^2 Y + XY^2 + YX^2 + Y^2 X - 2XYX - 2YXY\right). \]

7.6.2 Show that the cubic polynomial in Exercise 7.6.1 is a linear combination of
[X,[X,Y]] and [Y,[Y,X]].

The idea of representing the Z in $e^Z = e^X e^Y$ by a power series in noncommuting
variables X and Y allows us to prove the converse of the theorem that XY = YX
implies $e^X e^Y = e^{X+Y}$.

7.6.3 Suppose that $e^X e^Y = e^Y e^X$. By appeal to the proof of the log multiplicative
property in Section 7.1, or otherwise, show that XY = YX.

7.6.4 Deduce from Exercise 7.6.3 that $e^X e^Y = e^{X+Y}$ if and only if XY = YX.


7.7 Eichler’s proof of Campbell-Baker-Hausdorff 

To facilitate an inductive proof, we let

\[ e^A e^B = e^Z, \qquad Z = F_1(A,B) + F_2(A,B) + F_3(A,B) + \cdots, \tag{*} \]

where $F_n(A,B)$ is the sum of all the terms of degree n in Z, and hence is a
homogeneous polynomial of degree n in the variables A and B. Since the
variables stand for matrices in the Lie algebra $\mathfrak{g}$, they do not generally commute,
but their product is associative. From the calculation in the previous
section we have

\[ F_1(A,B) = A + B, \qquad F_2(A,B) = \frac{1}{2}(AB - BA) = \frac{1}{2}[A,B]. \]

We will call a polynomial p(A, B, C, ...) Lie if it is a linear combination
of A, B, C, ... and (possibly nested) Lie bracket terms in A, B, C, .... Thus
$F_1(A,B)$ and $F_2(A,B)$ are Lie polynomials, and the theorem we wish to
prove is:

Campbell-Baker-Hausdorff theorem. For each $n \ge 1$, the polynomial
$F_n(A,B)$ in (*) is Lie.

Proof. Since products of A, B, C, ... are associative, the same is true of
products of power series in A, B, C, ..., so for any A, B, C we have

\[ (e^A e^B)e^C = e^A(e^B e^C), \]

and therefore, if $e^A e^B e^C = e^W$,

\[ W = \sum_{i=1}^{\infty} F_i\left(\sum_{j=1}^{\infty} F_j(A,B),\, C\right) = \sum_{i=1}^{\infty} F_i\left(A,\, \sum_{j=1}^{\infty} F_j(B,C)\right). \tag{1} \]

Our induction hypothesis is that $F_m$ is a Lie polynomial for m < n, and we
wish to prove that $F_n$ is Lie.

The induction hypothesis implies that all homogeneous terms of degree
less than n in both expressions for W in (1) are Lie, and so too are the
homogeneous terms of degree n resulting from i > 1 and j > 1. The only
possible exceptions are the polynomials

$F_n(A,B) + F_n(A+B,C)$ on the left (from i = 1, j = n and i = n, j = 1),
$F_n(A,B+C) + F_n(B,C)$ on the right (from i = n, j = 1 and i = 1, j = n).

Therefore, equating terms of degree n on both sides of (1), we find that the
difference between the exceptional polynomials is a Lie polynomial. This
property is a congruence relation between polynomials that we write as

\[ F_n(A,B) + F_n(A+B,C) \equiv_{\mathrm{Lie}} F_n(A,B+C) + F_n(B,C). \tag{2} \]

Relation (2) yields many consequences, by substituting special values of
the variables A, B, and C, and from it we eventually derive $F_n(A,B) \equiv_{\mathrm{Lie}} 0$,
thus proving the desired result that $F_n$ is Lie.

Before we start substituting, here are three general facts concerning
real multiples of the variables.

1. $F_n(rA, sA) = 0$, because the matrices rA and sA commute and hence
$e^{rA} e^{sA} = e^{rA+sA}$. That is, $Z = F_1(rA, sA)$, so all other $F_n(rA, sA) = 0$.

2. In particular, r = 1 and s = 0 gives $F_n(A, 0) = 0$.

3. $F_n(rA, rB) = r^n F_n(A,B)$ because $F_n$ is homogeneous of degree n.
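
As a quick sanity check, which is not part of Eichler's argument, the three facts can be verified by hand for the one nonlinear polynomial computed so far, $F_2(A,B) = \frac{1}{2}[A,B]$:

```latex
\begin{align*}
F_2(rA, sA) &= \tfrac{1}{2}[rA, sA] = \tfrac{rs}{2}(AA - AA) = 0, \\
F_2(A, 0)   &= \tfrac{1}{2}[A, 0] = 0, \\
F_2(rA, rB) &= \tfrac{1}{2}\bigl((rA)(rB) - (rB)(rA)\bigr) = r^2 \cdot \tfrac{1}{2}[A,B] = r^2 F_2(A,B).
\end{align*}
```
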
These facts guide the following substitutions in the congruence (2). 

First, replace C by −B in (2), obtaining

\begin{align*}
F_n(A,B) + F_n(A+B,-B) &\equiv_{\mathrm{Lie}} F_n(A,0) + F_n(B,-B) \\
&\equiv_{\mathrm{Lie}} 0 \quad \text{by facts 2 and 1.}
\end{align*}

Therefore

\[ F_n(A,B) \equiv_{\mathrm{Lie}} -F_n(A+B,-B). \tag{3} \]

Then replace A by −B in (2), obtaining

\[ F_n(-B,B) + F_n(0,C) \equiv_{\mathrm{Lie}} F_n(-B,B+C) + F_n(B,C), \]

which gives, by facts 1 and 2 again,

\[ 0 \equiv_{\mathrm{Lie}} F_n(-B,B+C) + F_n(B,C). \]

Next, replacing B, C by A, B respectively gives

\[ 0 \equiv_{\mathrm{Lie}} F_n(-A,A+B) + F_n(A,B), \]

and hence

\[ F_n(A,B) \equiv_{\mathrm{Lie}} -F_n(-A,A+B). \tag{4} \]

Relations (3) and (4) allow us to relate $F_n(A,B)$ to $F_n(B,A)$ as follows:

\begin{align*}
F_n(A,B) &\equiv_{\mathrm{Lie}} -F_n(-A,A+B) && \text{by (4)} \\
&\equiv_{\mathrm{Lie}} -(-F_n(-A+A+B,-A-B)) && \text{by (3)} \\
&\equiv_{\mathrm{Lie}} F_n(B,-A-B) \\
&\equiv_{\mathrm{Lie}} -F_n(-B,-A) && \text{by (4)} \\
&\equiv_{\mathrm{Lie}} -(-1)^n F_n(B,A) && \text{by fact 3.}
\end{align*}

Thus the relation between $F_n(A,B)$ and $F_n(B,A)$ is

\[ F_n(A,B) \equiv_{\mathrm{Lie}} -(-1)^n F_n(B,A). \tag{5} \]
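
For n = 2 this relation can be checked directly against $F_2(A,B) = \frac{1}{2}[A,B]$, where it holds with equality rather than mere congruence:

```latex
\[
-(-1)^2 F_2(B,A) = -\tfrac{1}{2}[B,A] = -\tfrac{1}{2}(BA - AB) = \tfrac{1}{2}[A,B] = F_2(A,B).
\]
```
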

Second, we replace C by —B/2 in (2), which gives 

F n {A,B) + F n {A + B,-B/ 2) = Lie F„(A,B/2) + F„(B,-B/2) 

=Lie F n (A,B/2) by fact 1, 


so 


F„(A,B) = Lie F n (A,B/2)-F n (A + B,-B/2). (6) 

Next, replacing $A$ by $-B/2$ in (2) gives

\[ F_n(-B/2,B) + F_n(B/2,C) \equiv_{\mathrm{Lie}} F_n(-B/2,B+C) + F_n(B,C), \]

and therefore, by fact 1,

\[ F_n(B/2,C) \equiv_{\mathrm{Lie}} F_n(-B/2,B+C) + F_n(B,C). \]

Then, replacing $B$, $C$ by $A$, $B$ respectively gives

\[ F_n(A/2,B) \equiv_{\mathrm{Lie}} F_n(-A/2,A+B) + F_n(A,B), \]

that is,

\[ F_n(A,B) \equiv_{\mathrm{Lie}} F_n(A/2,B) - F_n(-A/2,A+B). \tag{7} \]
Relations (6) and (7) allow us to pass from polynomials in $A$, $B$ to
polynomials in $A/2$, $B/2$, paving the way for another application of fact 3
and a new relation, between $F_n(A,B)$ and itself.

Relation (6) allows us to rewrite the two terms on the right side of (7)
as follows:

\[
\begin{aligned}
F_n(A/2,B) &\equiv_{\mathrm{Lie}} F_n(A/2,B/2) - F_n(A/2+B,-B/2) && \text{by (6)} \\
&\equiv_{\mathrm{Lie}} F_n(A/2,B/2) + F_n(A/2+B/2,B/2) && \text{by (3)} \\
&\equiv_{\mathrm{Lie}} 2^{-n}F_n(A,B) + 2^{-n}F_n(A+B,B) && \text{by fact 3,}
\end{aligned}
\]

\[
\begin{aligned}
F_n(-A/2,A+B) &\equiv_{\mathrm{Lie}} F_n(-A/2,A/2+B/2) - F_n(A/2+B,-A/2-B/2) && \text{by (6)} \\
&\equiv_{\mathrm{Lie}} -F_n(A/2,B/2) + F_n(B/2,A/2+B/2) && \text{by (4) and (3)} \\
&\equiv_{\mathrm{Lie}} -2^{-n}F_n(A,B) + 2^{-n}F_n(B,A+B) && \text{by fact 3.}
\end{aligned}
\]

So (7) becomes

\[ F_n(A,B) \equiv_{\mathrm{Lie}} 2^{1-n}F_n(A,B) + 2^{-n}F_n(A+B,B) - 2^{-n}F_n(B,A+B), \]

and, with the help of (5), this simplifies to

\[ (1-2^{1-n})F_n(A,B) \equiv_{\mathrm{Lie}} 2^{-n}\bigl(1+(-1)^n\bigr)F_n(A+B,B). \tag{8} \]

If $n$ is odd, (8) already shows that $F_n(A,B) \equiv_{\mathrm{Lie}} 0$.

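If $n$ is even, we replace $A$ by $A-B$ in (8), obtaining

\[ (1-2^{1-n})F_n(A-B,B) \equiv_{\mathrm{Lie}} 2^{1-n}F_n(A,B). \tag{9} \]

The left side of (9) satisfies

\[ (1-2^{1-n})F_n(A-B,B) \equiv_{\mathrm{Lie}} -(1-2^{1-n})F_n(A,-B) \quad \text{by (3),} \]

so, making this replacement, (9) becomes

\[ F_n(A,-B) \equiv_{\mathrm{Lie}} -\frac{2^{1-n}}{1-2^{1-n}}\,F_n(A,B). \tag{10} \]

Finally, replacing $B$ by $-B$ in (10) and then applying (10) once more, we get

\[ F_n(A,B) \equiv_{\mathrm{Lie}} \left(\frac{2^{1-n}}{1-2^{1-n}}\right)^{2} F_n(A,B), \]

and, since the coefficient on the right is not 1 when $n > 2$, this implies
$F_n(A,B) \equiv_{\mathrm{Lie}} 0$, as required.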


□ 
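Relation (5) can be spot-checked numerically for small $n$. The sketch below is only an illustration: it assumes the standard low-order Campbell-Baker-Hausdorff terms $F_2(A,B) = \tfrac12[A,B]$ and $F_3(A,B) = \tfrac1{12}\bigl([A,[A,B]] + [B,[B,A]]\bigr)$, which are not derived in this section, and all helper names are hypothetical.

```python
# Spot check of relation (5) for n = 2 and n = 3 with 2x2 real matrices.
# F2 and F3 are the standard low-order CBH terms (an assumption of this sketch).

def mul(X, Y):
    """Matrix product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y, c=1.0):
    """Return X + c*Y."""
    return [[x + c * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return add(mul(X, Y), mul(Y, X), -1.0)

def F2(A, B):
    return scale(0.5, bracket(A, B))

def F3(A, B):
    return scale(1.0 / 12.0,
                 add(bracket(A, bracket(A, B)), bracket(B, bracket(B, A))))

def close(X, Y, tol=1e-12):
    return all(abs(x - y) <= tol
               for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[0.0, 1.0], [2.0, 0.5]]
B = [[1.0, -1.0], [0.0, 3.0]]

# (5) with n = 2 reads F2(A,B) = -F2(B,A); with n = 3 it reads F3(A,B) = F3(B,A).
check_n2 = close(F2(A, B), scale(-1.0, F2(B, A)))
check_n3 = close(F3(A, B), F3(B, A))
```

For these two terms the relation holds as an exact equality of polynomials, which is the strengthening discussed in the exercises that follow.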
Exercises

The congruence relation (5)

\[ F_n(A,B) \equiv_{\mathrm{Lie}} -(-1)^n F_n(B,A) \]

discovered in the above proof can be strengthened remarkably to

\[ F_n(A,B) = -(-1)^n F_n(B,A). \]

Here is why.

7.7.1 If $Z(A,B)$ denotes the solution $Z$ of the equation $e^A e^B = e^Z$, explain why
$Z(-B,-A) = -Z(A,B)$.

7.7.2 Assuming that one may “equate coefficients” for power series in noncommuting
variables, deduce from Exercise 7.7.1 that

\[ F_n(A,B) = -(-1)^n F_n(B,A). \]

7.8 Discussion 

The beautiful self-contained theory of matrix Lie groups seems to have 
been discovered by von Neumann [1929]. In this little-known paper$^5$ von
Neumann defines the matrix Lie groups as closed subgroups of GL(n,C),
and their “tangents” as limits of convergent sequences of matrices. In this 
chapter we have recapitulated some of von Neumann’s results, streamlin- 
ing them slightly by using now-standard techniques of calculus and linear 
algebra. In particular, we have followed von Neumann in using the ma- 
trix exponential and logarithm to move smoothly back and forth between 
a matrix Lie group and its tangent space, without appealing to existence 
theorems for inverse functions and the solution of differential equations. 

The idea of using matrix Lie groups to introduce Lie theory was sug- 
gested by Howe [1983]. The recent texts of Rossmann [2002], Hall [2003], 
and Tapp [2005] take up this suggestion, but they move away from the ideas 
of von Neumann cited by Howe. All put similar theorems on center stage — 
viewing the Lie algebra g of G as both the tangent space and the domain 
of the exponential function — but they rely on analytic existence theorems 
rather than on von Neumann’s rock-bottom approach through convergent 
sequences of matrices. 

5 The only book I know that gives due credit to von Neumann’s paper is Godement 
[2004], where it is described on p. 69 as “the best possible introduction to Lie groups” and 
“the first ‘proper’ exposition of the subject.”
Indeed, von Neumann's purpose in pursuing elementary constructions
in Lie theory was to explain why continuity apparently implies differentia- 
bility for groups, a question raised by Hilbert in 1900 that became known as 
Hilbert’s fifth problem. It would take us too far afield to explain Hilbert’s 
fifth problem more precisely than we have already done in Section 7.3, 
other than to say that von Neumann showed that the answer is yes for com- 
pact groups, and that Gleason, Montgomery, and Zippin showed in 1952 
that the answer is yes for all groups. 

As mentioned in Section 4.7, Hamilton made the first extension of the
exponential function to a noncommutative domain by defining it for quaternions
in 1843. He observed almost immediately that it maps the pure imaginary
quaternions onto the unit quaternions, and that $e^{q+q'} = e^q e^{q'}$ when
$qq' = q'q$. He took the idea further in his Elements of Quaternions of
1866, realizing that $e^{q+q'}$ is not usually equal to $e^q e^{q'}$, because of the
noncommutative quaternion product. On p. 425 of Volume I he actually finds
the second-order approximation to the Campbell-Baker-Hausdorff series:

\[ e^{q+q'} - e^q e^{q'} = \frac{q'q - qq'}{2} + \text{terms of third and higher dimensions.} \]
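Hamilton's second-order formula can be tested numerically. The following sketch uses $2 \times 2$ real matrices in place of quaternions (an assumption made for simplicity; any associative algebra would do) and a truncated power series for the exponential. The names `expm` and `second_order_error` are illustrative only.

```python
# Check that e^{X+Y} - e^X e^Y - (YX - XY)/2 is of third order in t
# when X = tA and Y = tB, using a truncated exponential series.

def mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y, c=1.0):
    """Return X + c*Y."""
    return [[x + c * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(X, terms=30):
    """Partial sum 1 + X + X^2/2! + ... of the exponential series."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = scale(1.0 / k, mul(term, X))
        result = add(result, term)
    return result

def second_order_error(A, B, t):
    """Largest entry of e^{tA+tB} - e^{tA} e^{tB} - t^2 (BA - AB)/2."""
    tA, tB = scale(t, A), scale(t, B)
    lhs = add(expm(add(tA, tB)), mul(expm(tA), expm(tB)), -1.0)
    rhs = scale(0.5 * t * t, add(mul(B, A), mul(A, B), -1.0))
    return max(abs(x) for row in add(lhs, rhs, -1.0) for x in row)

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
err_1 = second_order_error(A, B, 0.1)
err_2 = second_order_error(A, B, 0.05)
```

Halving $t$ should shrink the remainder by roughly a factor of 8, consistent with a leading error term of third order.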

The early proofs (or attempted proofs) of the general Campbell-Baker- 
Hausdorff theorem around 1900 were extremely lengthy — around 20 pages. 
The situation did not improve when Bourbaki developed a more concep- 
tual approach to the theorem in the 1960s. See for example Serre [1965], or 
Section 4 of Bourbaki [1972], Chapter II. Bourbaki believes that the proper
setting for the theorem is in the framework of free magmas, free algebras, 
free groups, and free Lie algebras, all of which takes longer to explain 
than the proofs by Campbell, Baker, and Hausdorff. It seems to me that 
these proofs are totally outclassed by the Eichler proof I have used in this 
chapter, which assumes only that the variables A, B, C have an associative 
product, and uses only calculations that a high-school student can follow. 

Martin Eichler (1912-1992) was a German mathematician (later living 
in Switzerland) who worked mainly in number theory and related parts of 
algebra and analysis. A famous saying, attributed to him, is that there are 
five fundamental operations of arithmetic: addition, subtraction, multipli- 
cation, division, and modular forms. Some of his work involves orthogonal 
groups, but nevertheless his 1968 paper on the Campbell-Baker-Hausdorff 
theorem seems to come out of the blue. Perhaps this is a case in which an 
outsider saw the essence of a theorem more clearly than the experts. 
Topology


Preview 

One of the essential properties of a Lie group G is that the product and in- 
verse operations on G are continuous functions. Consequently, there comes 
a point in Lie theory where it is necessary to study the theory of continuity, 
that is, topology. Our journey has now reached that point. 

We introduce the concepts of open and closed sets, in the concrete setting
of $k$-dimensional Euclidean space $\mathbb{R}^k$, and use them to explain the related
concepts of continuity, compactness, paths, path-connectedness, and
simple connectedness. The first fruit of this development is a topological
characterization of matrix Lie groups, defined in Section 7.2 through the
limit concept.

All such groups are subgroups of the general linear group GL(n,C) 
of invertible complex matrices, for some n. They are precisely the closed 
subgroups of GL(n,C). 

The concepts of compactness and path-connectedness serve to refine
this description. For example, O(n) and SO(n) are compact but GL(n,C)
is not; SO(n) is path-connected but O(n) is not.

Finally, we introduce the concept of deformation of paths, which al- 
lows us to define simple connectivity. A simply connected space is one in 
which any two paths between two points are deformable into each other. 
This refines the qualitative description of Lie groups further — for exam- 
ple, SU(2) is simply connected but SO(2) is not — but simply connected 
groups have a deeper importance that will emerge when we reconnect with 
Lie algebras in the next chapter. 
8.1 Open and closed sets in Euclidean space

The geometric setting used throughout this book is the Euclidean space

\[ \mathbb{R}^k = \{(x_1,x_2,\ldots,x_k) : x_1,x_2,\ldots,x_k \in \mathbb{R}\}, \]

with distance $d(X,Y)$ between points

\[ X = (x_1,x_2,\ldots,x_k) \quad\text{and}\quad Y = (y_1,y_2,\ldots,y_k) \]

defined by

\[ d(X,Y) = \sqrt{(x_1-y_1)^2 + (x_2-y_2)^2 + \cdots + (x_k-y_k)^2}. \]

This is the distance on $\mathbb{R}^k$ that is invariant under the transformations in
the group O(k) and its subgroup SO(k). Also, when we interpret $\mathbb{C}^n$ as
$\mathbb{R}^{2n}$ by letting the point $(x_1 + ix'_1, x_2 + ix'_2, \ldots, x_n + ix'_n) \in \mathbb{C}^n$ correspond
to the point $(x_1, x'_1, \ldots, x_n, x'_n) \in \mathbb{R}^{2n}$, then the distance defined by the
Hermitian inner product on $\mathbb{C}^n$ is the same as the Euclidean distance on
$\mathbb{R}^{2n}$, as we saw in Section 3.3. Likewise, the distance on $\mathbb{H}^n$ defined by its
Hermitian inner product is the same as the Euclidean distance on $\mathbb{R}^{4n}$.

As in Section 4.5 we view an $n \times n$ real matrix $A$ with $(i,j)$-entry $a_{ij}$
as the point $(a_{11}, a_{12}, \ldots, a_{1n}, a_{21}, \ldots, a_{nn}) \in \mathbb{R}^{n^2}$, and define the absolute
value $|A|$ of $A$ as the Euclidean distance $\sqrt{\sum_{i,j} a_{ij}^2}$ of this point from 0 in
$\mathbb{R}^{n^2}$. We similarly define the absolute value of $n \times n$ complex and quaternion
matrices by interpreting them as points of $\mathbb{R}^{2n^2}$ and $\mathbb{R}^{4n^2}$, respectively.
Then if we take the distance between matrices $A$ and $B$ of the same size
and type to be $|A - B|$, we can speak of a convergent sequence of matrices
$A_1, A_2, A_3, \ldots$ with limit $A$, or of a continuous matrix-valued function $A(t)$,
by using the usual definitions in terms of distance $\varepsilon$ from the limit.
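As a concrete illustration of the absolute value and the resulting notion of convergence, here is a minimal sketch. The function names and the sample sequence `A_k` are assumptions chosen for the example, not notation from the text.

```python
import math

def abs_matrix(A):
    """Absolute value |A|: square root of the sum of squared entry moduli.

    Using abs(a) ** 2 also handles complex entries, which matches viewing
    an n x n complex matrix as a point of R^(2 n^2).
    """
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

def distance(A, B):
    """Distance |A - B| between two matrices of the same size."""
    return abs_matrix([[a - b for a, b in zip(ra, rb)]
                       for ra, rb in zip(A, B)])

def A_k(k):
    """A sample sequence of 2x2 matrices converging to the identity."""
    return [[1.0 + 1.0 / k, 1.0 / k ** 2], [0.0, 1.0]]

limit = [[1.0, 0.0], [0.0, 1.0]]
distances = [distance(A_k(k), limit) for k in (1, 10, 100, 1000)]
```

The list `distances` shrinks toward 0, which is what convergence of $A_1, A_2, A_3, \ldots$ to the limit means in terms of $|A_k - A|$.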

Topology gives a general language for the discussion of limits and con- 
tinuity by expressing them in terms of open sets. 

Open and closed sets 

To be able to express the idea of a “neighborhood” concisely we introduce
the notation $N_\varepsilon(P)$ for the open $\varepsilon$-ball with center $P$, that is,

\[ N_\varepsilon(P) = \{Q \in \mathbb{R}^k : |P - Q| < \varepsilon\}. \]

The set $N_\varepsilon(P)$ is also called the $\varepsilon$-neighborhood of $P$.

A set $\mathcal{O} \subseteq \mathbb{R}^k$ is called open if, along with any point $P \in \mathcal{O}$, there is an
$\varepsilon$-neighborhood $N_\varepsilon(P) \subseteq \mathcal{O}$ for some $\varepsilon > 0$. Three properties of open sets
follow almost immediately from this definition.$^6$

1. Both $\mathbb{R}^k$ and the empty set {} are open.

2. Any union of open sets is open. 

3. The intersection of two (and hence any finite number of) open sets is 
open. 

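The third property holds because if $P \in \mathcal{O}_1$ and $P \in \mathcal{O}_2$ then we have

\[ P \in N_{\varepsilon_1}(P) \subseteq \mathcal{O}_1 \quad\text{and}\quad P \in N_{\varepsilon_2}(P) \subseteq \mathcal{O}_2, \]

so $P \in N_\varepsilon(P) \subseteq \mathcal{O}_1 \cap \mathcal{O}_2$, where $\varepsilon$ is the minimum of $\varepsilon_1$ and $\varepsilon_2$.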

Open sets are the fundamental concept of topology, and all other topological
concepts can be defined in terms of them. For example, a closed
set$^7$ $\mathcal{F}$ is one whose complement $\mathbb{R}^k - \mathcal{F}$ is open. It follows from
properties 1, 2, 3 of open sets that we have the following properties of closed
sets:

1. Both $\mathbb{R}^k$ and the empty set {} are closed.

2. Any intersection of closed sets is closed. 

3. The union of two (and hence any finite number of) closed sets is 
closed. 

The reason for calling such sets “closed” is that they are closed under the
operation of adding limit points. A limit point of a set $\mathcal{S}$ is a point $P$
such that every $\varepsilon$-neighborhood of $P$ includes points of $\mathcal{S}$. A closed set $\mathcal{F}$
includes all its limit points $P$. This is so because if $P$ is a point not in $\mathcal{F}$
then $P$ is in the open complement $\mathbb{R}^k - \mathcal{F}$ and hence $P$ has a neighborhood
$N_\varepsilon(P) \subseteq \mathbb{R}^k - \mathcal{F}$. But then $N_\varepsilon(P)$ does not include any points of $\mathcal{F}$, so $P$
is not a limit point of $\mathcal{F}$.

6 In general topology, where $\mathbb{R}^k$ is replaced by an arbitrary set $S$, these three properties
define what is called a collection of open sets. In general topology there need be no underlying
concept of “distance,” hence open sets cannot always be defined in terms of $\varepsilon$-balls.
We will make use of the concept of distance where it is convenient, but it will be noticed
that the general topological properties of open sets frequently give a natural proof.

7 It is traditional to denote closed sets by the initial letter of “fermé,” the French word
for “closed.”



The relative topology 

Many spaces $\mathcal{S}$ other than $\mathbb{R}^k$ have a notion of distance, so the definition
of open and closed sets in terms of $\varepsilon$-balls may be carried over directly. In
particular, if $\mathcal{S}$ is a subset of some $\mathbb{R}^k$ we have:

• The $\varepsilon$-balls of $\mathcal{S}$, $N_\varepsilon(P) = \{Q \in \mathcal{S} : |P - Q| < \varepsilon\}$, are the intersections
of $\mathcal{S}$ with $\varepsilon$-balls of $\mathbb{R}^k$.

• So the open subsets of $\mathcal{S}$ are the intersections of $\mathcal{S}$ with the open
subsets of $\mathbb{R}^k$.

• So the closed subsets of $\mathcal{S}$ are the intersections of $\mathcal{S}$ with the closed
subsets of $\mathbb{R}^k$.

The topology resulting from this definition of open set is called the relative
topology on $\mathcal{S}$. It is important at a few places in this chapter, notably for
the definition of a matrix Lie group in the next section.

Notice that $\mathcal{S}$ is automatically a closed set in the relative topology,
since it is the intersection of $\mathcal{S}$ with a closed subset of $\mathbb{R}^k$, namely $\mathbb{R}^k$
itself. This does not imply that $\mathcal{S}$ contains all its limit points; indeed, this
happens only if $\mathcal{S}$ is a closed subset of $\mathbb{R}^k$.

Exercises 

Open sets and closed sets are common in mathematics. For example, an open
interval $(a,b) = \{x \in \mathbb{R} : a < x < b\}$ is an open subset of $\mathbb{R}$ and a closed interval
$[a,b] = \{x \in \mathbb{R} : a \le x \le b\}$ is closed.

8.1.1 Show that a half-open interval $[a,b) = \{x : a \le x < b\}$ is neither open nor
closed.

8.1.2 With the help of Exercise 8.1.1, or otherwise, give an example of an infinite
union of closed sets that is not closed.

8.1.3 Give an example of an infinite intersection of open sets that is not open.

Since a random subset $\mathcal{T}$ of a space $\mathcal{S}$ may not be closed we sometimes
find it convenient to introduce a closure operation that takes the intersection of all
closed sets $\mathcal{F} \supseteq \mathcal{T}$:

\[ \mathrm{closure}(\mathcal{T}) = \bigcap \{\mathcal{F} \subseteq \mathcal{S} : \mathcal{F} \text{ is closed and } \mathcal{F} \supseteq \mathcal{T}\}. \]

8.1.4 Explain why closure($\mathcal{T}$) is a closed set containing $\mathcal{T}$.

8.1.5 Explain why it is reasonable to call closure($\mathcal{T}$) the “smallest” closed set
containing $\mathcal{T}$.

8.1.6 Show that closure($\mathcal{T}$) = $\mathcal{T} \cup \{\text{limit points of } \mathcal{T}\}$ when $\mathcal{T} \subseteq \mathbb{R}^k$.
8.2 Closed matrix groups

In Lie theory, closed sets are important from the beginning, because all 
matrix Lie groups are closed sets in the appropriate topology. This has to do 
with the continuity of matrix multiplication and the determinant function, 
which we assume for now. In the next section we will discuss continuity 
and its relationship with open and closed sets more thoroughly. 

Example 1. The circle group $\mathbb{S}^1$ = SO(2).

Viewed as a set of points in $\mathbb{C}$ or $\mathbb{R}^2$, the unit circle is a closed set
because its complement (the set of points not on the circle) is clearly open.
Figure 8.1 shows a typical point $P$ not on the circle and an $\varepsilon$-neighborhood
of $P$ that lies in the complement of the circle. The open neighborhood of $P$
is colored gray and its perimeter is drawn dotted to indicate that boundary
points are not included.



Figure 8.1: Why the complement of the circle is open. 

Example 2. The groups O(n) and SO(n).

We view O(n) as a subset of the space $\mathbb{R}^{n^2}$ of $n \times n$ real matrices, which
we also call $M_n(\mathbb{R})$. The complement of O(n) is

\[ M_n(\mathbb{R}) - \mathrm{O}(n) = \{A \in M_n(\mathbb{R}) : AA^{\mathsf{T}} \ne \mathbf{1}\}. \]

This set is open because if $A$ is a matrix in $M_n(\mathbb{R})$ with $AA^{\mathsf{T}} \ne \mathbf{1}$ then
some entries of $AA^{\mathsf{T}}$ are unequal to the corresponding entries (1 or 0) in $\mathbf{1}$.
It follows, since matrix multiplication and transpose are continuous, that
$BB^{\mathsf{T}}$ also has entries unequal to the corresponding entries of $\mathbf{1}$ for any $B$
sufficiently close to $A$. Thus some $\varepsilon$-neighborhood of $A$ is contained in
$M_n(\mathbb{R}) - \mathrm{O}(n)$, so O(n) is closed.
Matrices $A$ in SO(n) satisfy the additional condition $\det(A) = 1$. The
matrices $A$ not satisfying this condition form an open set because det is
a continuous function. Namely, if $\det(A) \ne 1$, then $\det(B) \ne 1$ for any $B$
sufficiently close to $A$; hence any $A$ in the set where $\det \ne 1$ has a whole
$\varepsilon$-neighborhood in this set. Thus the matrices $A$ for which $\det(A) = 1$ form
a closed set. The group SO(n) is the intersection of this closed set with the
closed set O(n), hence SO(n) is itself closed.

Example 3. The group Aff(1).

We view Aff(1) as in Section 4.6, namely, as the group of real matrices
of the form $A = \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$, where $a, b \in \mathbb{R}$ and $a > 0$. It is now easy to see that
the group is not closed, because it contains the sequence

\[ \begin{pmatrix} 1/n & 0 \\ 0 & 1 \end{pmatrix}, \quad n = 1, 2, 3, \ldots, \]

whose limit $\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ is not in Aff(1). However, Aff(1) is closed in the “relative”
sense: as a subset of the largest $2 \times 2$ matrix group that contains it.
This is because Aff(1) is the intersection of a closed set (the set of
matrices $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$ with $a \ge 0$) with the set of all invertible $2 \times 2$ matrices. This
brings us to our next example.
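The failure of Aff(1) to be closed can be made concrete with a small sketch. It assumes, purely for illustration, the sequence of matrices with $a = 1/n$ and $b = 0$; the helper names `in_aff1` and `det2` are hypothetical.

```python
def in_aff1(M):
    """Membership test for 2x2 real matrices [[a, b], [0, 1]] with a > 0."""
    (a, b), (c, d) = M
    return c == 0 and d == 1 and a > 0

def det2(M):
    (a, b), (c, d) = M
    return a * d - b * c

# Every term of the sequence lies in Aff(1) ...
sequence = [[[1.0 / n, 0.0], [0.0, 1.0]] for n in range(1, 6)]

# ... but the entrywise limit as n grows has a = 0, so it is not in Aff(1);
# it is not even invertible.
limit = [[0.0, 0.0], [0.0, 1.0]]
```

Since the limit has determinant 0, it lies outside the set of invertible matrices altogether, which is why Aff(1) can still be closed relative to that set.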

Example 4. The general linear group GL(n,C). 

The group GL(n,C) is the set of all invertible $n \times n$ complex matrices.
This set is a group because it is closed under products (since $A^{-1}B^{-1} =
(BA)^{-1}$) and under inverses (obviously). It follows that every group of real
or complex matrices is a subgroup of some GL(n,C),$^8$ which is why we
bring it up now. We are about to define what a “matrix Lie group” is, and
we wish to say that it is some kind of subgroup of GL(n,C).

But first notice that GL(n,C) is not a closed subset of the space $M_n(\mathbb{C})$
of $n \times n$ complex matrices. Indeed, if $\mathbf{1}$ is the $n \times n$ identity matrix, then the
matrices $\mathbf{1}/2, \mathbf{1}/3, \mathbf{1}/4, \ldots$ all belong to GL(n,C) but their limit 0 does not.
We can say only that GL(n,C) is a closed subset of itself, and the definition
of matrix Lie group turns upon this appeal to the relative topology.

8 GL(n,C) was called “Her All-embracing Majesty” by Hermann Weyl in his book The
Classical Groups. Notice that quaternion groups may also be viewed as subgroups of
GL(n,C), thanks to the identification of quaternions with certain $2 \times 2$ complex matrices
in Section 1.3.
Matrix Lie groups

With the understanding that the topology of all matrix groups should be 
considered relative to GL(n,C), we make the following definition: 

Definition. A matrix Lie group is a closed subgroup of GL(n,C).

This definition is beautifully simple, but still surprising. Lie groups are 
supposed to be “smooth,” yet closed sets are not usually smooth (think of a 
square or a triangle, say). Apparently the group operation has a “smooth- 
ing” effect. And again, there are some closed subgroups of GL(n,C) that 
do not even look smooth, for example the group {1} consisting of a single 
point! The worry about { 1 } disappears when one takes a sufficiently gen- 
eral definition of “smoothness,” as explained in Section 5.8. The real secret 
of smoothness is the matrix exponential function, as we saw in Section 7.3. 

Exercises 

8.2.1 Prove that U(n), SU(n), and Sp(n) are closed subsets of the appropriate
matrix spaces.

The general linear group GL(n,C) is usually introduced alongside the special
linear group. Both are subsets of the space $M_n(\mathbb{C})$ of complex $n \times n$ matrices:

\[ \mathrm{GL}(n,\mathbb{C}) = \{A : \det(A) \ne 0\} \quad\text{and}\quad \mathrm{SL}(n,\mathbb{C}) = \{A : \det(A) = 1\}. \]

8.2.2 Show that GL(n,C) is an open subset of $M_n(\mathbb{C})$.

8.2.3 Show that SL(n,C) is a closed subset of $M_n(\mathbb{C})$.

8.2.4 If $H$ is an arbitrary subgroup of a matrix Lie group $G$, show that

\[ \{\text{sequential tangents of } H\} = T_{\mathbf{1}}(\mathrm{closure}(H)). \]


8.3 Continuous functions 

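As in elementary analysis, we define a function $f$ to be continuous at a
point $A$ if, for each $\varepsilon > 0$, there is a $\delta > 0$ such that

\[ |B - A| < \delta \;\Rightarrow\; |f(B) - f(A)| < \varepsilon. \]

If the points $A$ and $B$ belong to $\mathbb{R}^k$ and the values $f(A)$ and $f(B)$ belong
to $\mathbb{R}^l$ then the $\varepsilon$-$\delta$ condition can be restated as follows: for each $\varepsilon$-ball
$N_\varepsilon(f(A))$ there is a $\delta$-ball $N_\delta(A)$ such that

\[ \{f(B) : B \in N_\delta(A)\} \subseteq N_\varepsilon(f(A)). \tag{$*$} \]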


It is convenient and natural to introduce the abbreviations

\[ f(\mathcal{S}) \text{ for } \{f(B) : B \in \mathcal{S}\}, \qquad f^{-1}(\mathcal{S}) \text{ for } \{B : f(B) \in \mathcal{S}\}. \]

Then the condition (*) can be restated: $f$ is continuous at $A$ if, for each
$\varepsilon > 0$, there is a $\delta > 0$ such that

\[ f(N_\delta(A)) \subseteq N_\varepsilon(f(A)). \]

Finally, if $f$ is continuous for some domain of argument values $A$ and some
range of function values $f(A)$ then, for each open subset $\mathcal{O}$ of the range of
$f$, we have

\[ f^{-1}(\mathcal{O}) \text{ is open.} \tag{$**$} \]

This is because $f^{-1}(\mathcal{O})$ contains, along with each point $A$, a neighborhood
$N_\delta(A)$ of $A$, mapped by $f$ into a neighborhood $N_\varepsilon(f(A))$ of $f(A)$, contained
in the open set $\mathcal{O}$ along with $f(A)$.

Condition (**) is equivalent to condition (*) in spaces such as $\mathbb{R}^k$, and it
serves as the definition of a continuous function in general topology, since
it is phrased in terms of open sets alone.


Basic continuous functions 


As one learns in elementary analysis, the basic functions of arithmetic are
continuous at all points at which they are defined. Also, composites of continuous
functions are continuous. For example, the composite of addition,
subtraction, and division given by

\[ f(a,b) = \frac{a+b}{a-b} \]

is continuous for all pairs $(a,b)$ at which it is defined, that is, for all pairs
such that $a \ne b$.

A matrix function $f$ is called continuous at $A$ if it satisfies the $\varepsilon$-$\delta$
definition for absolute value of matrices. That is, for all $\varepsilon$ there is a $\delta$ such
that

\[ |B - A| < \delta \;\Rightarrow\; |f(B) - f(A)| < \varepsilon. \]

This is equivalent to being a continuous numerical function of the matrix
entries. Important examples for Lie theory are the matrix product and the
determinant, both of which are continuous because they are built from addition
and multiplication of numbers. The matrix inverse $f(A) = A^{-1}$ is
also a continuous function of $A$, built from addition, multiplication, and division
(by $\det(A)$). It is defined for all $A$ with $\det(A) \ne 0$, which of course
are also the $A$ for which $A^{-1}$ exists.
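For $2 \times 2$ matrices the inverse can be written out explicitly with nothing but arithmetic on the entries and a division by the determinant, which makes its continuity visible. The sketch below is an illustration under that assumption; the names `inverse2`, `abs_matrix`, and `distance` are hypothetical.

```python
import math

def det2(M):
    (a, b), (c, d) = M
    return a * d - b * c

def inverse2(M):
    """Inverse of a 2x2 matrix by the adjugate formula; needs det(M) != 0."""
    (a, b), (c, d) = M
    det = det2(M)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def abs_matrix(M):
    return math.sqrt(sum(x * x for row in M for x in row))

def distance(M, N):
    return abs_matrix([[x - y for x, y in zip(rm, rn)]
                       for rm, rn in zip(M, N)])

A = [[2.0, 1.0], [1.0, 1.0]]

# Perturb one entry of A by smaller and smaller amounts and record how far
# the inverse moves.
gaps = []
for delta in (1e-1, 1e-2, 1e-3):
    B = [[2.0 + delta, 1.0], [1.0, 1.0]]
    gaps.append(distance(inverse2(A), inverse2(B)))
```

As $B$ approaches $A$, the recorded distances between the inverses shrink as well, in line with continuity of $A \mapsto A^{-1}$ wherever $\det(A) \ne 0$.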

Homeomorphisms 

Continuous functions might be considered the “homomorphisms” of topology,
but if so an “isomorphism” is not simply a 1-to-1 homomorphism. A
topological isomorphism should also have a continuous inverse. A continuous
function $f$ such that $f^{-1}$ exists and is continuous is called a homeomorphism.
We will also call such a 1-to-1 correspondence, continuous in
both directions, a continuous bijection.

We must specifically demand a continuous inverse because the inverse
of a continuous 1-to-1 function is not necessarily continuous. The simplest
example is the map from the half-open interval $[0, 2\pi)$ to the circle defined
by $f(\theta) = \cos\theta + i\sin\theta$ (Figure 8.2).



This map $f$ is clearly continuous and 1-to-1, but $f^{-1}$ is not continuous.
For example, $f^{-1}(\mathcal{O})$, where $\mathcal{O}$ is a small open arc of the circle between
angle $-\alpha$ and $\alpha$, is $(2\pi - \alpha, 2\pi) \cup [0, \alpha)$, which is not an open set. (More
informally, $f^{-1}$ sends points that are near each other on the circle to points
that are far apart on the interval.)

It is clear that $f$ is not an “isomorphism” between $[0, 2\pi)$ and the circle,
because the two spaces have different topological properties. For example,
the circle is compact but $[0, 2\pi)$ is not. (For the definition of compactness,
see the next section.)
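The informal remark about $f^{-1}$ can be seen numerically. This is a minimal sketch; the names `f` and `f_inverse` and the choice $\alpha = 0.01$ are assumptions made for the illustration.

```python
import cmath
import math

def f(theta):
    """The map from [0, 2*pi) to the unit circle, cos(theta) + i sin(theta)."""
    return complex(math.cos(theta), math.sin(theta))

def f_inverse(z):
    """Angle of a point z on the unit circle, normalized to [0, 2*pi)."""
    return cmath.phase(z) % (2 * math.pi)

alpha = 0.01
z_above = f(alpha)                  # just past the point 1 on the circle
z_below = f(2 * math.pi - alpha)    # just before the point 1 on the circle

gap_on_circle = abs(z_above - z_below)
gap_on_interval = abs(f_inverse(z_above) - f_inverse(z_below))
```

The two points are about 0.02 apart on the circle, yet their preimages are almost $2\pi$ apart in the interval, so no $\delta$ can work for $f^{-1}$ at the point 1.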

Exercises 

If homeomorphisms are the “isomorphisms” of topological spaces, what operation
do they preserve? The answer is that homeomorphisms are the 1-to-1 functions $f$
that preserve closures, where “closure” is defined in the exercises to Section 8.1:

\[ f(\mathrm{closure}(\mathcal{S})) = \mathrm{closure}(f(\mathcal{S})). \]


8.3.1 Show that if $P$ is a limit point of $\mathcal{S}$ and $f$ is a continuous function defined
on $\mathcal{S}$ and $P$, then $f(P)$ is a limit point of $f(\mathcal{S})$.

8.3.2 If $f$ is a continuous bijection, deduce from Exercise 8.3.1 that

\[ f(\mathrm{closure}(\mathcal{S})) = \mathrm{closure}(f(\mathcal{S})). \]

8.3.3 Give examples of continuous functions $f$ on subsets of $\mathbb{R}$ such that $f$(open)
is not open and $f$(closed) is not closed.

8.3.4 Also, give an example of a continuous function $f$ on $\mathbb{R}$ and a set $\mathcal{S}$ such
that

\[ f(\mathrm{closure}(\mathcal{S})) \ne \mathrm{closure}(f(\mathcal{S})). \]

8.4 Compact sets 

A compact set in $\mathbb{R}^k$ is one that is closed and bounded. Compact sets
are somewhat better behaved than unbounded closed sets; for example, on 
a compact set a continuous function is uniformly continuous, and a real- 
valued continuous function attains a maximum and a minimum value. One 
learns these results in an introductory real analysis course, but we will 
prove one version of uniform continuity below. In Lie theory, compact 
groups are better behaved than noncompact ones, and fortunately most of 
the classical groups are compact. 

We already know from Section 8.2 that O(n) and SO(n) are closed. To
see why they are compact, recall from Section 3.1 that the columns of any
$A \in \mathrm{O}(n)$ form an orthonormal basis of $\mathbb{R}^n$. This implies that the sum of
the squares of the entries in any column is 1, hence the sum of the squares
of all entries is $n$. In other words, $|A| = \sqrt{n}$, so O(n) is a closed subset of
$\mathbb{R}^{n^2}$ bounded by radius $\sqrt{n}$.
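The identity $|A| = \sqrt{n}$ is easy to confirm on examples. The sketch below checks it for a rotation in SO(2) and for a signed permutation matrix in O(3); both sample matrices and the helper names are choices made for this illustration.

```python
import math

def abs_matrix(A):
    return math.sqrt(sum(a * a for row in A for a in row))

def transpose(A):
    return [list(col) for col in zip(*A)]

def mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_orthogonal(A, tol=1e-12):
    """Check A A^T = 1 entry by entry, up to rounding error."""
    n = len(A)
    P = mul(A, transpose(A))
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def rotation2(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

R = rotation2(0.7)                                     # element of SO(2)
P = [[0.0, 1.0, 0.0], [0.0, 0.0, -1.0], [1.0, 0.0, 0.0]]  # element of O(3)
```

Here `abs_matrix(R)` agrees with $\sqrt{2}$ and `abs_matrix(P)` with $\sqrt{3}$, so both matrices lie on the sphere of radius $\sqrt{n}$ in $\mathbb{R}^{n^2}$.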

There are similar proofs that U(n), SU(n), and Sp(n) are compact.
Compactness may also be defined in terms of open sets, and hence it is
meaningful in spaces without a concept of distance. The definition is motivated
by the following classical theorem, which expresses the compactness
of the unit interval [0,1] in terms of open sets.

Heine-Borel theorem. If [0,1] is contained in a union of open intervals
$\mathcal{U}_i$, then the union of finitely many $\mathcal{U}_i$ also contains [0,1].

Proof. Suppose, on the contrary, that no finite union of the $\mathcal{U}_i$ contains
[0,1]. Then at least one of the subintervals [0,1/2] or [1/2,1] is not contained
in a finite union of $\mathcal{U}_i$ (because if both halves are contained in the
union of finitely many $\mathcal{U}_i$, so is the whole).

Pick, say, the leftmost of the two intervals [0,1/2] and [1/2,1] not contained
in a finite union of $\mathcal{U}_i$ and divide it into halves similarly. By the same
argument, one of the new subintervals is not contained in a finite union of
the $\mathcal{U}_i$, and so on.

By repeating this argument indefinitely, we get an infinite sequence of
intervals $[0,1] = \mathcal{I}_1 \supset \mathcal{I}_2 \supset \mathcal{I}_3 \supset \cdots$. Each $\mathcal{I}_{n+1}$ is half the length of $\mathcal{I}_n$
and none of them is contained in the union of finitely many $\mathcal{U}_i$. But there
is a single point $P$ in all the $\mathcal{I}_n$ (namely the common limit of their left and
right endpoints), and $P \in [0,1]$ so $P$ is in some $\mathcal{U}_j$.

This is a contradiction, because a sufficiently small $\mathcal{I}_n$ containing $P$ is
contained in $\mathcal{U}_j$, since $\mathcal{U}_j$ is open. So in fact [0,1] is contained in the union
of finitely many $\mathcal{U}_i$. □
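The bisection idea also has a constructive reading: keep halving until each piece fits inside a single open interval of the cover. The sketch below is a hedged illustration of that reading, not the proof by contradiction itself. It takes a finite list of open intervals as input (an assumption needed to make the search computable), and the names `finite_subcover` and `covers` are hypothetical.

```python
def finite_subcover(intervals, lo=0.0, hi=1.0, max_depth=50):
    """Select open intervals (a, b) from `intervals` that cover [lo, hi].

    If [lo, hi] fits inside one interval, take it; otherwise bisect.
    Raises ValueError if pieces keep failing to fit after max_depth halvings,
    which is what happens when the family does not cover [lo, hi].
    """
    for a, b in intervals:
        if a < lo and hi < b:
            return [(a, b)]
    if max_depth == 0:
        raise ValueError("bisection did not terminate; "
                         "the family may not cover the interval")
    mid = (lo + hi) / 2.0
    left = finite_subcover(intervals, lo, mid, max_depth - 1)
    right = finite_subcover(intervals, mid, hi, max_depth - 1)
    return left + [iv for iv in right if iv not in left]

def covers(cover, lo=0.0, hi=1.0):
    """Check that the open intervals in `cover` together contain [lo, hi]."""
    reach = lo
    while reach <= hi:
        ends = [b for a, b in cover if a < reach < b]
        if not ends:
            return False
        reach = max(ends)
    return True

# A cover of [0, 1] by 101 short overlapping open intervals.
family = [(k / 100.0 - 0.02, k / 100.0 + 0.02) for k in range(101)]
subcover = finite_subcover(family)
```

For a family that misses a point, such as `[(-0.1, 0.5), (0.5, 1.1)]`, the pieces containing 0.5 never fit inside one interval, and the routine gives up with an error, mirroring the nested intervals in the proof.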

The general definition of compactness motivated by this theorem is the
following. A set $\mathcal{K}$ is called compact if, for any collection of open sets
$\mathcal{O}_i$ whose union contains $\mathcal{K}$, there is a finite subcollection $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_m$
whose union contains $\mathcal{K}$. The collection of sets $\mathcal{O}_i$ is said to be an “open
cover” of $\mathcal{K}$, and the subcollection $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_m$ is said to be a “finite
subcover,” so the defining property of compactness is often expressed as
“any open cover contains a finite subcover.”

The argument used to prove the Heine-Borel theorem is known as the
“bisection argument,” and it easily generalizes to a “$2^k$-section argument”
in $\mathbb{R}^k$, proving that any closed bounded set has the finite subcover property.

For example, given a closed, bounded set $\mathcal{K}$ in $\mathbb{R}^2$, we take a square
that contains $\mathcal{K}$ and consider the subsets of $\mathcal{K}$ obtained by dividing the
square into four equal subsquares, then dividing the subsquares, and so on.
If $\mathcal{K}$ has no finite subcover, then the same is true of a nested sequence of
subsets with a single common point $P$, which leads to a contradiction as in
the proof for [0,1].

Exercises 

The bisection argument is also effective in another classical theorem about the
unit interval: the Bolzano-Weierstrass theorem, which states that any infinite set
of points $\{P_1, P_2, P_3, \ldots\}$ in [0,1] has a limit point.

8.4.1 Given an infinite set of points $\{P_1, P_2, P_3, \ldots\}$ in [0,1], conclude that at least
one of the subintervals [0,1/2], [1/2,1] contains infinitely many of the $P_i$.

8.4.2 (Bolzano-Weierstrass). By repeated bisection, show that there is a point $P$
in [0,1], every neighborhood of which contains some of the points $P_i$.

8.4.3 Generalize the argument of Exercise 8.4.2 to show that if $\mathcal{K}$ is a closed
bounded set in $\mathbb{R}^k$ containing an infinite set of points $\{P_1, P_2, P_3, \ldots\}$ then
$\mathcal{K}$ includes a limit point of $\{P_1, P_2, P_3, \ldots\}$.

(We used a special case of this theorem in Section 7.4 in claiming that an infi- 
nite sequence of points on the unit sphere has a limit point, and hence a convergent 
subsequence.) 

The generalized Bolzano-Weierstrass theorem of Exercise 8.4.3 may also be
proved very naturally using the finite subcover property of compactness. Suppose,
for the sake of contradiction, that $\{P_1, P_2, P_3, \ldots\}$ is an infinite set of points in a
compact set $\mathcal{K}$, with no limit point in $\mathcal{K}$. It follows that each point $Q \in \mathcal{K}$ has an
open neighborhood $\mathcal{N}(Q)$ in $\mathcal{K}$ (the intersection of an open set with $\mathcal{K}$) free of
points $P_i \ne Q$.

8.4.4 By taking a finite subcover of the cover of $\mathcal{K}$ by the sets $\mathcal{N}(Q)$, show that
the assumption leads to a contradiction.

Not all matrix Lie groups are compact.

8.4.5 Show that GL(n,C) and SL(n,C) are not compact.


8.5 Continuous functions and compactness 

We saw in Section 8.3 and its exercises that continuous functions do not 
necessarily preserve open sets or closed sets. However, they do preserve 
compact sets, so this is another example of “better behavior” of compact 
sets. The proof also shows the efficiency of the finite subcover property of 
compactness. 

Continuous image of a compact set. If $\mathcal{K}$ is compact and $f$ is a continuous
function defined on $\mathcal{K}$ then $f(\mathcal{K})$ is compact.

Proof. Given a collection of open sets $\mathcal{O}_i$ that covers $f(\mathcal{K})$, we have to
show that some finite subcollection $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_m$ also covers $f(\mathcal{K})$.

Well, since $f$ is continuous and $\mathcal{O}_i$ is open, we know that $f^{-1}(\mathcal{O}_i)$ is
open by Property (**) in Section 8.3. Also, the open sets $f^{-1}(\mathcal{O}_i)$ cover
$\mathcal{K}$ because the $\mathcal{O}_i$ cover $f(\mathcal{K})$. Therefore, by compactness of $\mathcal{K}$, there
is a finite subcollection $f^{-1}(\mathcal{O}_1), f^{-1}(\mathcal{O}_2), \ldots, f^{-1}(\mathcal{O}_m)$ that covers $\mathcal{K}$.

But then $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_m$ covers $f(\mathcal{K})$, as required. □

It may be thought that a problem arises when the open sets $\mathcal{O}_i$ extend
outside $f(\mathcal{K})$, possibly outside the range of the function $f$. We avoid this
problem by considering only open subsets relative to $\mathcal{K}$ and $f(\mathcal{K})$, that
is, the intersections of open sets with $\mathcal{K}$ and $f(\mathcal{K})$. For such sets it is still
true that $f^{-1}$(“open”) = “open” when $f$ is continuous, and so the argument
goes through.

A convenient property of continuous functions on compact sets is uniform
continuity. As always, a continuous $f : \mathcal{S} \to \mathcal{T}$ has the property
that for each $\varepsilon > 0$ there is a $\delta > 0$ such that $f$ maps a $\delta$-neighborhood of
each point $P \in \mathcal{S}$ into an $\varepsilon$-neighborhood of $f(P) \in \mathcal{T}$. We say that $f$ is
uniformly continuous if $\delta$ depends only on $\varepsilon$, not on $P$.

Uniform continuity. If $\mathcal{K}$ is a compact subset of $\mathbb{R}^m$ and $f : \mathcal{K} \to \mathbb{R}^n$ is
continuous, then $f$ is uniformly continuous.

Proof. Since $f$ is continuous, for any $\varepsilon > 0$ and any $P \in \mathcal{K}$ there is a
neighborhood $N_{\delta(P)}(P)$ mapped by $f$ into $N_{\varepsilon/2}(f(P))$. To create some room to
move later, we cover $\mathcal{K}$ with the half-sized neighborhoods $N_{\delta(P)/2}(P)$,
then apply compactness to conclude that $\mathcal{K}$ is contained in some finite
union of them, say

\[ \mathcal{K} \subseteq N_{\delta(P_1)/2}(P_1) \cup N_{\delta(P_2)/2}(P_2) \cup \cdots \cup N_{\delta(P_k)/2}(P_k). \]

If we let

\[ \delta = \min\{\delta(P_1)/2, \delta(P_2)/2, \ldots, \delta(P_k)/2\}, \]

then each point in $\mathcal{K}$ lies in a set $N_{\delta(P_i)/2}(P_i)$ and each of the sets $N_{\delta(P_i)}(P_i)$
has radius at least $2\delta$. I claim that $|Q - R| < \delta$ implies $|f(Q) - f(R)| < \varepsilon$
for any $Q, R \in \mathcal{K}$, so $f$ is uniformly continuous on $\mathcal{K}$.

To see why, take any $Q, R \in \mathcal{K}$ such that $|Q - R| < \delta$ and a half-sized
neighborhood $N_{\delta(P_i)/2}(P_i)$ that includes $Q$. Then

\[ |P_i - Q| < \delta(P_i)/2 \quad\text{and}\quad |Q - R| < \delta \le \delta(P_i)/2, \]

so it follows by the triangle inequality that

\[ |P_i - R| < \delta(P_i), \quad\text{and hence}\quad R \in N_{\delta(P_i)}(P_i). \]

Also, it follows from the definition of $N_{\delta(P_i)}(P_i)$ that $|f(P_i) - f(Q)| < \varepsilon/2$
and $|f(P_i) - f(R)| < \varepsilon/2$, so

\[ |f(Q) - f(R)| < \varepsilon, \]


again by the triangle inequality. 


□ 
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Exercises 

The above proof of uniform continuity is complicated by the possibility that
$\mathcal{K}$ is at least two-dimensional. This forces us to use triangles and the
triangle inequality. If we have $\mathcal{K} = [0, 1]$ then a more straightforward proof
exists.

8.5.1 Suppose that $N_{\delta(P_1)}(P_1) \cup \cdots \cup N_{\delta(P_k)}(P_k)$ is a finite union of
open intervals that contains $[0, 1]$. Use the finitely many endpoints of these
intervals to define a number $\delta > 0$ such that any two points $P, Q \in [0, 1]$
with $|P - Q| < \delta$ lie in the same interval $N_{\delta(P_i)}(P_i)$.

8.5.2 Deduce from Exercise 8.5.1 that any continuous function on $[0, 1]$ is
uniformly continuous.


8.6 Paths and path-connectedness 

The idea of a “curve” or “path” has evolved considerably over the course 
of mathematical history. The old term locus (meaning “place” in Latin)
shows that a curve was once considered to be the (set of) places occupied 
by points satisfying a certain geometric condition. For example, a circle is 
the locus of points at a constant distance from a particular point, the center 
of the circle. Later, under the influence of dynamics, a curve came to be 
viewed as the orbit of a point moving according to some law of motion, 
such as Newton’s law of gravitation. The position p(t) of the moving point 
at any time t is some continuous function of t. 

In topology today, we take the function itself to be the curve. That is, 
a curve or path in a space $\mathcal{S}$ is a continuous function $p : [0, 1] \to \mathcal{S}$. The
interval [0, 1] plays the role of the time interval over which the point is in 
motion — any interval would do as well, and it is sometimes convenient to 
allow arbitrary closed intervals, as we will do below. More importantly, 
the path is the function p and not just its image. A case in which the 
image fails quite spectacularly to reflect the function is the space-filling 
curve discovered by Peano in 1890. The image of Peano’s curve is a square 
region of the plane, so the image cannot tell us even the endpoints $A = f(0)$
and $B = f(1)$ of the curve, let alone how the curve makes its way from $A$
to B. 

In Lie theory, paths give a way to distinguish groups that are “all of a
piece,” such as the circle group SO(2), from groups that consist of
“separate pieces,” such as O(2). In Chapter 3 we showed connectedness by
describing specific paths. In the present chapter we wish to discuss paths
more generally, so we introduce the following general definitions.

Definitions. A path in a set $G$ is a continuous map $p : I \to G$, where
$I = [a, b]$ is some closed interval⁹ of real numbers. A set $G$ is called
path-connected if, for any $A, B \in G$, there is a path $p : [a, b] \to G$ with
$p(a) = A$ and $p(b) = B$. If $p$ is a path from $A$ to $B$ with domain $[a, b]$ and
$q$ is a path from $B$ to $C$ with domain $[b, c]$ then we call the path $pq$
defined by

$$pq(t) = \begin{cases} p(t) & \text{if } t \in [a, b], \\ q(t) & \text{if } t \in [b, c] \end{cases}$$

the concatenation of $p$ and $q$.

Clearly, if there is a path $p$ from $A$ to $B$ with domain $[a, b]$ then there is
a path $p'$ from $A$ to $B$ with any closed interval as domain. Thus if there are
paths from $A$ to $B$ and from $B$ to $C$ we can always arrange for the domains
of these paths to be contiguous intervals, so the concatenation of the two
paths is defined. Indeed, we can insist that all paths have domain $[0, 1]$, at
the cost of a slightly less natural definition of concatenation (this is often
done in topology books).

Whichever definition is chosen, one has the following consequences: 

• If there is a path from A to B then there is a “reverse” path from B 
to A. (If p with domain [0,1] is a path from A to B, consider the 
function q(t) = p( 1 — t).) 

• If there are paths in G from A to B, and from B to C, then there is a 
path in G from A to C. (Concatenate.) 

• If $G^\circ$ is the subset of $G$ consisting of all $A \in G$ for which there is
a path from 1 to $A$, then $G^\circ$ is path-connected. (For any $B, C \in G^\circ$,
concatenate the paths from $B$ to 1 and from 1 to $C$.)
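The reversal and concatenation operations used in these three consequences can be sketched in code. This is a minimal illustration, not a definitive implementation: the class name `Path` and its methods are hypothetical, continuity is assumed rather than checked, and the second path is shifted to a contiguous interval exactly as the text describes.

```python
import math

class Path:
    """A function on a closed interval [a, b]; continuity is assumed, not verified."""

    def __init__(self, func, a=0.0, b=1.0):
        self.func, self.a, self.b = func, a, b

    def __call__(self, t):
        if not self.a <= t <= self.b:
            raise ValueError("t lies outside the domain of the path")
        return self.func(t)

    def reverse(self):
        # q(t) = p(a + b - t); on [0, 1] this is q(t) = p(1 - t).
        return Path(lambda t: self.func(self.a + self.b - t), self.a, self.b)

    def reparametrize(self, c, d):
        # Same image, traversed over [c, d] instead of [a, b].
        scale = (self.b - self.a) / (d - c)
        return Path(lambda t: self.func(self.a + (t - c) * scale), c, d)

    def concat(self, other):
        # Make the domains contiguous, then follow self and afterwards other.
        if other.a != self.b:
            other = other.reparametrize(self.b, self.b + (other.b - other.a))
        return Path(lambda t: self.func(t) if t <= self.b else other.func(t),
                    self.a, other.b)

# Sample paths on the unit circle: p runs from (1, 0) to (-1, 0), q runs back.
p = Path(lambda t: (math.cos(math.pi * t), math.sin(math.pi * t)))
q = Path(lambda t: (math.cos(math.pi * (1 + t)), math.sin(math.pi * (1 + t))))
r = p.reverse()      # a path from (-1, 0) to (1, 0)
pq = p.concat(q)     # a path with domain [0, 2]
```

The concatenation is continuous only when the final point of the first path equals the initial point of the second, which the sketch leaves to the caller.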

In a group G, the path-connected subset G° just described is called the 
path-component of the identity, or simply the identity component. The set 
G° has significant algebraic properties. These properties were explored 
in some exercises in Chapter 3, but the following theorem and its proof 
develop them more precisely. 

⁹ We regret that mathematicians use the [ , ] notation for both closed intervals and Lie
brackets, but it should always be clear from the context which meaning is intended.

Normality of the identity component. If $G^\circ$ is the identity component of
a matrix Lie group $G$, then $G^\circ$ is a normal subgroup of $G$.

Proof. First we prove that $G^\circ$ is a subgroup of $G$ by showing that $G^\circ$ is
closed under products and inverses.

If $A, B \in G^\circ$ then there are paths $A(t)$ from 1 to $A$ and $B(t)$ from 1 to $B$.
Since matrix multiplication is continuous, $AB(t)$ is a path in $G$ from $A$ to
$AB$, so it follows by concatenation of paths from 1 to $A$ and from $A$ to $AB$
that $AB \in G^\circ$. Similarly, $A^{-1}A(t)$ is a path in $G$ from $A^{-1}$ to 1, so it
follows by path reversal that $A^{-1}$ is also in $G^\circ$.

To prove that $G^\circ$ is normal we need to show that $AG^\circ A^{-1} = G^\circ$ for each
$A \in G$. It suffices to prove that $AG^\circ A^{-1} \subseteq G^\circ$ for each $A \in G$, because in
that case we have $G^\circ \subseteq A^{-1}G^\circ A$ (multiplying the containment on the left
by $A^{-1}$ and on the right by $A$), and hence also $G^\circ \subseteq AG^\circ A^{-1}$ (replacing the
arbitrary $A$ by $A^{-1}$).

It is true that $AG^\circ A^{-1} \subseteq G^\circ$, because $AG^\circ A^{-1}$ is a path-connected set
(the image of $G^\circ$ under the continuous maps of left and right multiplication
by $A$ and $A^{-1}$) and it includes the identity element of $G$ as $A1A^{-1}$. □

It follows from this theorem that a non-discrete matrix Lie group is not
simple unless it is path-connected. We know from Chapter 3 that O(n) is
not path-connected for any n, and that SO(n), SU(n), and Sp(n) are
path-connected for all n. Another interesting case, whose proof occurs as an
exercise on p. 49 of Hall [2003], is the following.

Path-connectedness of GL(n, C) 

Suppose that $A$ and $B$ are two matrices in $\mathrm{GL}(n, \mathbb{C})$, so $\det(A) \ne 0$ and
$\det(B) \ne 0$. We wish to find a path from $A$ to $B$ in $\mathrm{GL}(n, \mathbb{C})$, that is,
through the $n \times n$ complex matrices with nonzero determinant.

We look for this path among the matrices of the form $(1 - z)A + zB$,
where $z \in \mathbb{C}$. These matrices form a plane, parameterized by the complex
coordinate $z$, and the plane includes $A$ at $z = 0$ and $B$ at $z = 1$. The path
from $A$ to $B$ has to avoid matrices $(1 - z)A + zB$ for which

$$\det((1 - z)A + zB) = 0. \qquad (*)$$

Now $(1 - z)A + zB$ is an $n \times n$ complex matrix whose entries are linear
terms in $z$. Its determinant is therefore a polynomial of degree at most $n$ in
$z$, and it is not the zero polynomial because its value at $z = 0$ is
$\det(A) \ne 0$. So, by the fundamental theorem of algebra, equation (*) has at
most $n$ roots.

These roots represent at most $n$ points in the plane of matrices
$(1 - z)A + zB$, not including the points $A$ and $B$. This allows us to find a
path, from $A$ to $B$ in the plane, avoiding the points with determinant zero,
as required. Thus $\mathrm{GL}(n, \mathbb{C})$ is path-connected. □
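The argument above is constructive enough to try numerically. The following sketch rests on assumptions that are not in the text: the function names are hypothetical, the roots of (*) are computed from the eigenvalues $\mu$ of $A^{-1}B$ (since $\det((1 - z)A + zB) = \det(A)\det(1 + z(A^{-1}B - 1))$, the roots are $z = 1/(1 - \mu)$ for $\mu \ne 1$), and the path is chosen from the particular family $z(t) = t + ict(1 - t)$, where only finitely many real values of $c$ send the path through a root.

```python
import numpy as np

def singular_parameters(A, B):
    """Values of z with det((1 - z)A + zB) = 0, assuming det(A) != 0."""
    mus = np.linalg.eigvals(np.linalg.solve(A, B))   # eigenvalues of A^{-1}B
    return np.array([1 / (1 - mu) for mu in mus if abs(1 - mu) > 1e-12])

def safe_bend(roots):
    """Choose c so that z(t) = t + i*c*t*(1 - t) misses every root."""
    # A root r with 0 < Re r < 1 is hit only for this single value of c.
    bad = [r.imag / (r.real * (1 - r.real)) for r in roots if 0 < r.real < 1]
    c = 0.0
    while any(abs(c - b) < 0.5 for b in bad):
        c += 1.0
    return c

def path_matrix(A, B, c, t):
    z = t + 1j * c * t * (1 - t)
    return (1 - z) * A + z * B

# Sample case: the straight segment from 1 to -1 passes through the zero matrix.
A = np.eye(2, dtype=complex)
B = -np.eye(2, dtype=complex)
roots = singular_parameters(A, B)
c = safe_bend(roots)
smallest = min(abs(np.linalg.det(path_matrix(A, B, c, t)))
               for t in np.linspace(0.0, 1.0, 201))
```

In this sample case the only root is $z = 1/2$, so the straight path $c = 0$ is rejected and the bent path with $c = 1$ keeps the determinant away from zero.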

Generating a path-connected group from a neighborhood of 1 

In Section 7.5 we claimed that any path-connected matrix Lie group $G$
is generated by a neighborhood $N_\delta(1)$ of 1, that is, any element of $G$ is a
product of members of $N_\delta(1)$. We can now prove this theorem with the
help of compactness.

Generating a path-connected group. If $G$ is a path-connected matrix Lie
group, and $N_\delta(1)$ is a neighborhood of 1 in $G$, then any element of $G$ is a
product of members of $N_\delta(1)$.

Proof. Since $G$ is path-connected, for any $A \in G$ there is a path $A(t)$ in
$G$ with $A(0) = 1$ and $A(1) = A$. Also, for each $t$, multiplication by $A(t)$
is a continuous map with a continuous inverse (namely, multiplication by
$A(t)^{-1}$). Hence, if $\mathcal{O}$ is any open set that includes 1, the set

$$A(t)\mathcal{O} = \{A(t)B : B \in \mathcal{O}\}$$

is an open set that includes the point $A(t)$. As $t$ runs from 0 to 1 the open
sets $A(t)\mathcal{O}$ cover the image of the path $A(t)$, which is the continuous image
of the compact set $[0, 1]$, hence compact by the first theorem in Section 8.5.
So in fact the image of the path lies in a finite union of sets,

$$A(t_1)\mathcal{O} \cup A(t_2)\mathcal{O} \cup \cdots \cup A(t_k)\mathcal{O}.$$

We can therefore find points $1 = A_1, A_2, \ldots, A_m = A$ on the path $A(t)$
such that, for any $i$, $A_i$ and $A_{i+1}$ lie in the same set $A(t_j)\mathcal{O}$. Notice that

$$A = A_1 \cdot A_1^{-1}A_2 \cdot A_2^{-1}A_3 \cdots A_{m-1}^{-1}A_m.$$

We can arrange that each factor of this product is in $N_\delta(1)$ by taking $\mathcal{O}$ to be
a subset of $N_\delta(1)$ small enough that $B_i^{-1}B_{i+1} \in N_\delta(1)$ for any
$B_i, B_{i+1} \in \mathcal{O}$. Then for each $i$ we have

$$A_i^{-1}A_{i+1} = (A(t_j)B_i)^{-1}A(t_j)B_{i+1} \quad\text{for some } t_j \text{ and some } B_i, B_{i+1} \in \mathcal{O}$$
$$= B_i^{-1}B_{i+1} \in N_\delta(1). \quad □$$

Corollary. If $G$ is a path-connected matrix Lie group then each element of
$G$ has the form $e^{X_1}e^{X_2} \cdots e^{X_m}$ for some $X_1, X_2, \ldots, X_m \in T_1(G)$.

Proof. Rerun the proof above with $N_\delta(1)$ chosen so that each element of
$N_\delta(1)$ has the form $e^X$, as is permissible by the theorem of Section 7.4.
Then each factor in the product

$$A = A_1 \cdot A_1^{-1}A_2 \cdot A_2^{-1}A_3 \cdots A_{m-1}^{-1}A_m$$

has the form $e^{X_i}$. □
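For a concrete picture of the theorem, consider the simplest path-connected case, SO(2). The sketch below is an illustration under assumptions of its own: the names `rot` and `small_factors` are hypothetical, the path is $A(t)$ = rotation through $t\theta$, the points $A_i$ are taken equally spaced along it, and distance from the identity is measured with the Frobenius norm. Each factor $A_i^{-1}A_{i+1}$ is then the same small rotation.

```python
import math
import numpy as np

def rot(theta):
    """The rotation of the plane through angle theta, an element of SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return np.array([[c, -s], [s, c]])

def small_factors(theta, delta):
    """Write rot(theta) as a product of m equal factors within delta of the identity."""
    m = 1
    while np.linalg.norm(rot(theta / m) - np.eye(2)) >= delta:
        m += 1                      # subdivide the path more finely
    return [rot(theta / m)] * m

theta, delta = 3.0, 0.1
factors = small_factors(theta, delta)
product = np.eye(2)
for factor in factors:
    product = product @ factor
```

Each factor is also the exponential of a small multiple of the matrix with rows (0, -1) and (1, 0), which is the shape of product described in the corollary.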

Exercises 

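The corollary brings to mind the element $\begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$ of
$\mathrm{SL}(2, \mathbb{C})$, shown not to be of the form $e^X$ for $X \in T_1(\mathrm{SL}(2, \mathbb{C}))$ in
Exercise 5.6.5.

8.6.1 Write $\begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$ as the product of two matrices in
$\mathrm{SL}(2, \mathbb{C})$ with entries 0, $i$, or $-i$.

8.6.2 Deduce from Exercise 8.6.1 and Exercise 5.6.4 that
$\begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix} = e^{X_1}e^{X_2}$ for some $X_1, X_2 \in T_1(\mathrm{SL}(2, \mathbb{C}))$.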

8.7 Simple connectedness 

A space $\mathcal{S}$ is called simply connected if it is path-connected and, for any
two paths $p$ and $q$ in $\mathcal{S}$ from point $A$ to point $B$, there is a deformation of
$p$ to $q$ with endpoints fixed. A deformation (or homotopy) of a path $p$ to
path $q$ is a continuous function of two real variables,

$$d : [0, 1] \times [0, 1] \to \mathcal{S}$$

such that

$$d(0, t) = p(t) \quad\text{and}\quad d(1, t) = q(t).$$

And the endpoints are fixed if

$$d(s, 0) = p(0) = q(0) \quad\text{and}\quad d(s, 1) = p(1) = q(1) \quad\text{for all } s.$$

Here one views the first variable as “time” and imagines a continuously
moving curve that equals $p$ at time 0 and $q$ at time 1. So $d$ is a “deformation
from curve $p$ to curve $q$.”

The restriction of $d$ to the bottom edge of the square $[0, 1] \times [0, 1]$ is one
path $p$, the restriction to the top edge is another path $q$, and the restriction
to the various horizontal sections of the square is a “continuous series”
of paths between $p$ and $q$. Figure 8.3 shows several of these sections, in
different shades of gray, and their images under some continuous map $d$.
These are “snapshots” of the deformation, so to speak.¹⁰



Figure 8.3: Snapshots of a path deformation with endpoints fixed. 

Simple connectivity is easy to define, but is quite hard to demonstrate
in all but the simplest case, which is that of $\mathbb{R}^k$. If $p$ and $q$ are paths in
$\mathbb{R}^k$ from $A$ to $B$, then $p$ and $q$ may each be deformed into the line segment
$AB$, and hence into each other. To deform $p$, say, one can move the point
$p(t)$ along the line segment from $p(t)$ to the point $(1 - t)A + tB$, traveling
a fraction $s$ of the total distance along this line in time $s$.
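Written as a formula, the deformation just described is $d(s, t) = (1 - s)\,p(t) + s\,((1 - t)A + tB)$. The sketch below evaluates it for a sample path; the function name `straighten` and the particular path `p` in the plane are hypothetical choices made for illustration.

```python
import numpy as np

def straighten(p, s, t):
    """d(s, t): the point a fraction s of the way from p(t) to (1 - t)A + tB."""
    A, B = p(0.0), p(1.0)
    return (1 - s) * p(t) + s * ((1 - t) * A + t * B)

def p(t):
    # A sample path in the plane from (0, 0) to (1, 0), bulging upward.
    return np.array([t, np.sin(np.pi * t)])

start = straighten(p, 0.0, 0.3)    # equals p(0.3): at time 0 the curve is p
finish = straighten(p, 1.0, 0.3)   # lies on the segment AB: at time 1 it is straight
```

Setting $t = 0$ or $t = 1$ in the formula gives $A$ or $B$ for every $s$, so the endpoints stay fixed throughout the deformation.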

The next-simplest case, that of $\mathbb{S}^k$ for $k > 1$, includes the important Lie
group SU(2) = Sp(1), the $\mathbb{S}^3$ of unit quaternions. On the sphere there
is not necessarily a unique “line segment” from $p(t)$ to the point we may
want to send it to, so the above argument for $\mathbb{R}^k$ does not work. One can
project $\mathbb{S}^k$ minus one point $P$ onto $\mathbb{R}^k$, and then do the deformation in
$\mathbb{R}^k$, but projection requires a point $P$ not in the image of $p$, and hence it
fails when $p$ is a space-filling curve. To overcome the difficulty one appeals
to compactness, which makes it possible to show that any path may be divided
into a finite number of “small” pieces, each of which may be deformed on
the sphere to a “line segment” (a great circle arc). This clears space on the
sphere that enables the projection method to work. For more details see the
exercises below.

¹⁰ Defining simple connectivity in terms of deformation of paths between any two points
$A$ and $B$ is convenient for our purposes, but there is a common equivalent definition in terms
of closed paths: $\mathcal{S}$ is simply connected if every closed path may be deformed to a point.
To see the equivalence, consider the closed path from $A$ to $B$ via $p$ and back again via $q$.
(Or, strictly speaking, via the “inverse of path $q$” defined by the function $q(1 - t)$.)

Compactness is also important in proving that certain groups are not
simply connected. The most important case is the circle $\mathbb{S}^1$ = SO(2), which
we now study in detail, because the idea of “lifting,” introduced here, will
be important in Chapter 9.

The circle and the line 

The function $f(\theta) = (\cos\theta, \sin\theta)$ maps $\mathbb{R}$ onto the unit circle $\mathbb{S}^1$. It is
called a covering of $\mathbb{S}^1$ by $\mathbb{R}$ and the points $\theta + 2n\pi \in \mathbb{R}$ are said to lie
over the point $(\cos\theta, \sin\theta) \in \mathbb{S}^1$. This map is far from being 1-to-1, because
infinitely many points of $\mathbb{R}$ lie over each point of $\mathbb{S}^1$. For example, the
points over (1, 0) are the real numbers $2n\pi$ for all integers $n$ (Figure 8.4).

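[Figure: the line $\mathbb{R}$, with the points $-4\pi$, $-2\pi$, $0$, $2\pi$, $4\pi$ marked, lying over the circle $\mathbb{S}^1$.]

Figure 8.4: The covering of the circle by the line.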

However, the restriction of $f$ to any interval of $\mathbb{R}$ with length $< 2\pi$
is 1-to-1 and continuous in both directions, so $f$ may be called a local
homeomorphism. Figure 8.4 shows an arc of $\mathbb{S}^1$ (in gray) of length $< 2\pi$
and all the intervals of $\mathbb{R}$ mapped onto it by $f$. The restriction of $f$ to any
one of these gray intervals is a homeomorphism.

The local homeomorphism property of $f$ allows us to relate path
deformations in $\mathbb{S}^1$ to path deformations in $\mathbb{R}$, which are more easily
understood. The first step is the following theorem, relating paths in $\mathbb{S}^1$ to
paths in $\mathbb{R}$ by a process called lifting.

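Unique path lifting. Suppose that $p$ is a path in $\mathbb{S}^1$ with initial point $P$,
and $\tilde{P}$ is a point in $\mathbb{R}$ over $P$. Then there is a unique path $\tilde{p}$ in $\mathbb{R}$ such
that $\tilde{p}(0) = \tilde{P}$ and $f \circ \tilde{p} = p$. We call $\tilde{p}$ the lift of $p$ with initial
point $\tilde{P}$.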

Proof. The path $p$ is a continuous function from $[0, 1]$ into $\mathbb{S}^1$, and hence it
is uniformly continuous by the theorem in Section 8.5. This means that we
can divide $[0, 1]$ into a finite number of subintervals, say
$\mathcal{I}_1, \mathcal{I}_2, \ldots, \mathcal{I}_k$ in left-to-right order, each of which is mapped by $p$ into
an arc of $\mathbb{S}^1$ of length $< 2\pi$. We let $p_j$ be the restriction of $p$ to $\mathcal{I}_j$ and
allow the term “path” to include all continuous functions on intervals of $\mathbb{R}$.

Then, since $f$ has a continuous inverse on intervals of length $< 2\pi$:

• There is a unique path $\tilde{p}_1 : \mathcal{I}_1 \to \mathbb{R}$, with initial point $\tilde{P}$, such that
$f \circ \tilde{p}_1 = p_1$. Namely, $\tilde{p}_1(t) = f^{-1}(p_1(t))$, where $f^{-1}$ is the inverse
of $f$ in the neighborhood of $\tilde{P}$. Let the final point of $\tilde{p}_1$ be $\tilde{P}_1$.

• Similarly, there is a unique path $\tilde{p}_2 : \mathcal{I}_2 \to \mathbb{R}$, with initial point $\tilde{P}_1$,
such that $f \circ \tilde{p}_2 = p_2$, and with final point $\tilde{P}_2$ say.

• And so on.

The concatenation of these paths $\tilde{p}_j$ in $\mathbb{R}$ is the lift $\tilde{p}$ of $p$ with initial
point $\tilde{P}$. □
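The proof is effectively an algorithm: cut the interval into pieces on which the path moves only a short way round the circle, and on each piece continue the angle from where the previous piece ended. A minimal sketch of that procedure follows. The function name `lift`, the fixed number of equal subintervals, and the sample path `twice_around` are assumptions made for illustration, and the sketch requires the path to turn through less than half a circle on each subinterval.

```python
import math

def lift(p, start, steps=1000):
    """Approximate the lift of a circle path p : [0, 1] -> S^1 beginning at `start`.

    Assumes p turns through less than half a circle on each of the subintervals.
    """
    theta = start
    lifted = [theta]
    x0, y0 = p(0.0)
    for i in range(1, steps + 1):
        x1, y1 = p(i / steps)
        # Signed angle from the previous point to the next one, in (-pi, pi].
        theta += math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
        lifted.append(theta)
        x0, y0 = x1, y1
    return lifted

def twice_around(t):
    # A closed path that winds twice around the circle, starting at (1, 0).
    return (math.cos(4 * math.pi * t), math.sin(4 * math.pi * t))

lifted = lift(twice_around, 0.0)
final_point = lifted[-1]           # close to 4*pi, although the path itself is closed
```

Starting the lift at another point over (1, 0), such as $2\pi$, shifts the whole lift by $2\pi$, in line with the uniqueness statement for each initial point.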

There is a similar proof of “unique deformation lifting” that leads to
the following result. Suppose $p$ and $q$ are paths from $A$ to $B$ in $\mathbb{S}^1$ and $p$ is
deformable to $q$ with endpoints fixed. Then the lift $\tilde{p}$ of $p$ with initial point
$\tilde{A}$ is deformable to the lift $\tilde{q}$ of $q$ with initial point $\tilde{A}$ with endpoints
fixed.

Now we are finally ready to prove that $\mathbb{S}^1$ is not simply connected. In
particular, we can prove that the upper semicircle path

$$p(t) = (\cos \pi t, \sin \pi t) \quad\text{from } (1, 0) \text{ to } (-1, 0)$$

is not deformable to the lower semicircle path

$$q(t) = (\cos(-\pi t), \sin(-\pi t)) \quad\text{from } (1, 0) \text{ to } (-1, 0).$$

This is because the lift $\tilde{p}$ of $p$ with initial point 0 has final point $\pi$, whereas
the lift $\tilde{q}$ of $q$ with initial point 0 has final point $-\pi$. Hence there is no
deformation of $\tilde{p}$ to $\tilde{q}$ with the endpoints fixed, and therefore no deformation
of $p$ to $q$ with endpoints fixed. □
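The two final points can be checked numerically. This sketch swaps the subdivision argument for NumPy's `unwrap`, which removes the artificial jumps of $2\pi$ from a sampled sequence of principal angles; the sample size and the helper name `lift_from_zero` are assumptions, and the helper relies on both paths starting at (1, 0), where the principal angle is 0.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 201)
upper = np.column_stack([np.cos(np.pi * t), np.sin(np.pi * t)])     # the path p
lower = np.column_stack([np.cos(-np.pi * t), np.sin(-np.pi * t)])   # the path q

def lift_from_zero(points):
    """Sampled lift, with initial point 0, of a circle path that starts at (1, 0)."""
    angles = np.arctan2(points[:, 1], points[:, 0])   # principal angles in [-pi, pi]
    return np.unwrap(angles)                          # continuous choice of angle

end_upper = lift_from_zero(upper)[-1]
end_lower = lift_from_zero(lower)[-1]
```

The paths `upper` and `lower` end at the same point of the circle, yet their lifts end at different points of the line, which is the obstruction used in the proof.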

Exercises 

To see why the spheres $\mathbb{S}^k$ with $k > 1$ are simply connected, first consider the
ordinary sphere $\mathbb{S}^2$.

8.7.1 Explain why, in a sufficiently small region of $\mathbb{S}^2$, there is a unique “line”
between any two points.

8.7.2 Use the uniqueness of “lines” in a small region $\mathcal{R}$ to define a deformation
of any curve $p$ from $A$ to $B$ in $\mathcal{R}$ to the “line” from $A$ to $B$.

Exercises 8.7.1 and 8.7.2, together with uniform continuity, allow any curve $p$ on
$\mathbb{S}^2$ to be deformed into a “spherical polygon,” which can then be projected onto a
curve on the plane.

It is geometrically obvious that there is a homeomorphism from $\mathbb{S}^2 - \{P\}$
onto $\mathbb{R}^2$ for any point $P \in \mathbb{S}^2$. Namely, choose coordinates so that $P$ is the north
pole (0, 0, 1) and map $\mathbb{S}^2 - \{P\}$ onto $\mathbb{R}^2$ by stereographic projection, as shown in
Figure 8.5.


Figure 8.5: Stereographic projection.


To generalize this idea to any $\mathbb{S}^k$ we have to describe stereographic projection
algebraically. So consider the $\mathbb{S}^k$ in $\mathbb{R}^{k+1}$, defined by the equation

$$x_1^2 + x_2^2 + \cdots + x_{k+1}^2 = 1.$$

We project $\mathbb{S}^k$ stereographically from the “north pole” $P = (0, 0, \ldots, 0, 1)$ onto the
subspace with equation $x_{k+1} = 0$.

8.7.3 Verify that the line through $P$ and any other point $(a_1, a_2, \ldots, a_{k+1}) \in \mathbb{S}^k$ has
parametric equations

$$x_1 = a_1 t, \quad x_2 = a_2 t, \quad \ldots, \quad x_k = a_k t, \quad x_{k+1} = 1 + (a_{k+1} - 1)t.$$

8.7.4 Show that the line in Exercise 8.7.3 meets the hyperplane $x_{k+1} = 0$ where

$$x_1 = \frac{a_1}{1 - a_{k+1}}, \quad x_2 = \frac{a_2}{1 - a_{k+1}}, \quad \ldots, \quad x_k = \frac{a_k}{1 - a_{k+1}}.$$

8.7.5 By solving the equations in Exercise 8.7.4, or otherwise, show that

$$a_1 = \frac{2x_1}{x_1^2 + \cdots + x_k^2 + 1}, \quad \ldots, \quad a_k = \frac{2x_k}{x_1^2 + \cdots + x_k^2 + 1},$$

and

$$a_{k+1} = \frac{x_1^2 + \cdots + x_k^2 - 1}{x_1^2 + \cdots + x_k^2 + 1}.$$

Hence conclude that stereographic projection is a homeomorphism.
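The formulas in Exercises 8.7.4 and 8.7.5 can be sanity-checked numerically before they are proved. The sketch below is a check rather than a solution: the function names `project` and `unproject` and the sample points are hypothetical, and a numerical round trip says nothing about continuity, which is the point of the final sentence of Exercise 8.7.5.

```python
import numpy as np

def project(a):
    """Stereographic projection from the north pole: (a_1, ..., a_{k+1}) -> (x_1, ..., x_k)."""
    a = np.asarray(a, dtype=float)
    return a[:-1] / (1.0 - a[-1])

def unproject(x):
    """The inverse map: (x_1, ..., x_k) -> the point of the sphere lying over it."""
    x = np.asarray(x, dtype=float)
    r2 = np.dot(x, x)                                  # x_1^2 + ... + x_k^2
    return np.append(2.0 * x, r2 - 1.0) / (r2 + 1.0)

a = np.array([0.5, 0.5, 0.5, -0.5])   # a point of the 3-sphere in R^4
x = project(a)                        # its image in R^3
back = unproject(x)                   # should recover a
```

The same two functions work for every dimension $k$, since they only use the last coordinate and the sum of squares of the others.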
8.8 Discussion

Closed, connected sets can be extremely pathological, even in $\mathbb{R}^2$. For
example, consider the set called the Sierpinski carpet, which consists of
the unit square with infinitely many open squares removed. Figure 8.6
shows what it looks like after several stages of construction. The original
unit square was black, and the white “holes” are where squares have been
removed. In reality, the total area of removed squares is 1, so the carpet is
“almost all holes.” Nevertheless, it is a closed, path-connected set.
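The claim about the total area can be checked with exact arithmetic. The sketch assumes the usual construction, which the text does not spell out: at stage $n$ one removes the open middle ninth of each of the $8^{n-1}$ remaining squares, each removed square having side $3^{-n}$. The function name `removed_area` is hypothetical.

```python
from fractions import Fraction

def removed_area(stages):
    """Total area removed from the unit square after the given number of stages."""
    total = Fraction(0)
    for n in range(1, stages + 1):
        total += Fraction(8 ** (n - 1), 9 ** n)   # 8^(n-1) squares, each of area 9^(-n)
    return total

after_two = removed_area(2)        # 1/9 + 8/81
after_many = removed_area(100)     # already very close to 1
```

The partial sums equal $1 - (8/9)^N$, a geometric series tending to 1, so the carpet that remains has area 0.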



Figure 8.6: The Sierpinski carpet 

Remarkably, imposing the condition that the closed set be a continuous
group removes any possibility of pathology, at least in the spaces of
$n \times n$ matrices. As von Neumann [1929] showed, a closed subgroup $G$ of
$\mathrm{GL}(n, \mathbb{C})$ has a neighborhood of 1 that can be mapped to a neighborhood
of 0 in some Euclidean space by the logarithm function, so $G$ is certainly
not full of holes. Also $G$ is smooth, in the sense of having a tangent space
at each point.

Thus in the world of matrix groups it is possible to avoid the technical- 
ities of smooth manifolds and work with the easier concepts of open sets, 
closed sets, and continuous functions. 

In this book we avoid the concept of smooth manifold; indeed, this is
one of the great advantages of restricting attention to matrix Lie groups.
But we have, of course, investigated “smoothness” as manifested by the
existence of a tangent space at the identity (and hence at every point) for
each matrix Lie group $G$. As we saw in Chapter 7, every matrix Lie group
$G$ has a tangent space $T_1(G)$ at the identity, and $T_1(G)$ equals some $\mathbb{R}^k$.
Even finite groups, such as $G = \{1\}$, have a tangent space at the identity;
not surprisingly it is the space $\mathbb{R}^0$.

Topology gives a way to describe all the matrix Lie groups with zero
tangent space: they are the discrete groups, where a group $H$ is called
discrete if there is a neighborhood of 1 not containing any elements of $H$
except 1 itself. Every finite group is obviously discrete, but there are also
infinite discrete groups; for example, $\mathbb{Z}$ is a discrete subgroup of $\mathbb{R}$. The
groups $\mathbb{Z}$ and $\mathbb{R}$ can be viewed as matrix groups by associating each $x \in \mathbb{R}$
with the matrix $\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$ (because multiplying two such matrices
results in addition of their $x$ entries).
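The parenthetical remark is a one-line computation, and it can also be checked mechanically. In this small sketch the function name `as_matrix` is a hypothetical label for the association of $x$ with its matrix.

```python
import numpy as np

def as_matrix(x):
    """The 2 x 2 matrix associated with the real number x."""
    return np.array([[1.0, x], [0.0, 1.0]])

# Multiplying the matrices adds the x entries: 2.5 + (-4.0) = -1.5.
product = as_matrix(2.5) @ as_matrix(-4.0)

# Integer entries stay integers, so the integers sit inside as a subgroup.
integer_product = as_matrix(3) @ as_matrix(4)
```

The identity matrix corresponds to $x = 0$ and the inverse of the matrix for $x$ is the matrix for $-x$, so the association is an isomorphism onto a matrix group.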

It follows immediately from the definition of discreteness that $T_1(H) = \{0\}$
for a discrete group $H$. It also follows that if $H$ is a discrete subgroup
of a matrix Lie group $G$ then $G/H$ is “locally isomorphic” to $G$ in some
neighborhood of 1. This is because every element of $G$ in some neighborhood
of 1 belongs to a different coset. From this we conclude that $G/H$
and $G$ have the same tangent space at 1, and hence the same Lie algebra.
This result shows, once again, why Lie algebras are simpler than Lie
groups: they do not “see” discrete subgroups.

Apart from the existence of a tangent space, there is an algebraic reason
for including the discrete matrix groups among the matrix Lie groups: they
occur as kernels of “Lie homomorphisms.” Since everything in Lie theory
is supposed to be smooth, the only homomorphisms between Lie groups
that belong to Lie theory are the smooth ones. We will not attempt a general
definition of smooth homomorphism here, but merely give an example: the
map $\Phi : \mathbb{R} \to \mathbb{S}^1$ defined by

$$\Phi(\theta) = e^{i\theta}.$$

This is surely a smooth map because $\Phi$ is a differentiable function of $\theta$.
The kernel of this $\Phi$ is the discrete subgroup of $\mathbb{R}$ (isomorphic to $\mathbb{Z}$)
consisting of the integer multiples of $2\pi$. We would like any natural aspect of a
Lie “thing” to be another Lie “thing,” so the kernel of a smooth homomorphism
ought to be a Lie group. This is an algebraic reason for considering
the discrete group $\mathbb{Z}$ to be a Lie group.
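Both properties of this example, that it is a homomorphism from addition to multiplication and that its kernel consists of the integer multiples of $2\pi$, are easy to test at sample values. The function name `phi` and the sample values below are illustrative assumptions.

```python
import cmath
import math

def phi(theta):
    """The map theta -> e^{i theta} from the line to the unit circle."""
    return cmath.exp(1j * theta)

# Homomorphism: addition on the line becomes multiplication on the circle.
homomorphism_gap = abs(phi(0.7 + 1.9) - phi(0.7) * phi(1.9))

# Integer multiples of 2*pi are sent to 1; a point such as pi is not.
kernel_gaps = [abs(phi(2 * math.pi * k) - 1) for k in range(-3, 4)]
outside_kernel = abs(phi(math.pi) - 1)
```

A finite sample cannot show that the kernel contains nothing else; that follows from the fact that $\cos\theta = 1$ and $\sin\theta = 0$ only when $\theta$ is an integer multiple of $2\pi$.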

The concepts of compactness, path-connectedness, simple connectedness,
and coverings play a fundamental role in topology, as a glance at any
topology book will show. Their role in Lie theory is also fundamental, and
in fact Lie theory provides some of the best illustrations of these concepts.
The covering of $\mathbb{S}^1$ by $\mathbb{R}$ is one, and we will see more in the next chapter.

Closed paths in SO(3) 

The group SO(3) of rotations of $\mathbb{R}^3$ is a striking example of a matrix Lie
group that is not simply connected. We exhibit a closed path in SO(3) that
cannot be deformed to a point in an informal demonstration known as the
“plate trick.”

Imagine carrying a plate of soup in one hand, keeping the plate horizontal
to avoid spilling. Now rotate the plate through 360°, returning it to
its original position in space (first three pictures in Figure 8.7).




Figure 8.7: The plate trick. 


The series of positions of the plate up to this stage may be regarded as
a continuous path in SO(3). This is because each position is determined by
an “axis” in $\mathbb{R}^3$ (the vector from the shoulder to the hand) and an “angle”
(the angle through which the plate has turned). This path in SO(3) is closed
because the initial and final points are the same (axis, angle) pair. We can
“deform” the path by varying the position of the arm and hand between
the initial and final positions. But it seems intuitively clear that we cannot
deform the path to a single point, because the path creates a full twist in
the arm, which cannot be removed by varying the path between the initial
and final positions.

However, traversing (a deformation of) the path again, as shown in
the last three pictures, returns the arm and hand to their initial untwisted
state! The topological meaning of this trick is that there is a closed path
$p$ in SO(3) that cannot be deformed to a point, whereas $p^2$ (the result of
traversing $p$ twice) can be deformed to a point. This topological property,
appropriately called torsion, is actually characteristic of projective spaces,
of which SO(3) is one. As we saw in Sections 2.2 and 2.3, SO(3) is the
same as the real projective space $\mathbb{RP}^3$.


9 

Simply connected Lie groups 


Preview 

Throughout our exposition of Lie algebras we have claimed that the structure
of the Lie algebra $\mathfrak{g}$ of a Lie group $G$ captures most, if not all, of the
structure of $G$. Now it is time to explain what, if anything, is lost when we
pass from $G$ to $\mathfrak{g}$. The short answer is that topological information is lost,
because the tangent space $\mathfrak{g}$ cannot reveal how $G$ may “wrap around” far
from the identity element.

The loss of information is already apparent in the case of $\mathbb{R}$, O(2), and
SO(2), all of which have the line as tangent space. A more interesting case
is that of O(3), SO(3), and SU(2), all of which have the Lie algebra $\mathfrak{so}(3)$.
These three groups are not isomorphic, and the differences between them
are best expressed in topological language, because the differences persist
even if we distort O(3), SO(3), and SU(2) by continuous 1-to-1 maps.

First, O(3) differs topologically from SO(3) and SU(2) because it is
not path-connected; there are two points in O(3) not connected by a path
in O(3). Second, SU(2) differs topologically from SO(3) in being simply
connected; that is, any closed path in SU(2) can be shrunk to a point.

We elaborate on these properties of O(3), SO(3), and SU(2) in Sections
9.1 and 9.2. Then we turn to the relationship between homomorphisms
of Lie groups and homomorphisms of Lie algebras: a Lie group
homomorphism $\Phi : G \to H$ “induces” a Lie algebra homomorphism
$\varphi : \mathfrak{g} \to \mathfrak{h}$, and if $G$ and $H$ are simply connected then $\varphi$ uniquely
determines $\Phi$. This leads to a definitive result on the extent to which a Lie
algebra $\mathfrak{g}$ “determines” its Lie group $G$: all simply connected groups with
the same Lie algebra are isomorphic.

9.1 Three groups with tangent space $\mathbb{R}$

The groups O(2) and SO(2) have the same tangent space, namely the
tangent line at the identity in SO(2), because the elements of O(2) not in
SO(2) are far from the identity and hence have no influence on the tangent
space. Figure 9.1 gives a geometric view of the situation.

The group SO(2) is shown as a circle, because SO(2) can be modeled
by the circle $\{z : |z| = 1\}$ in the plane of complex numbers. Its complement
O(2) − SO(2) is the coset $R \cdot \mathrm{SO}(2)$, where $R$ is any reflection of the plane
in a line through the origin. We can also view $R \cdot \mathrm{SO}(2)$ as a circle (lying
somewhere in the space of $2 \times 2$ real matrices), since multiplication by $R$
produces a continuous 1-to-1 image of SO(2). The circle O(2) − SO(2) is
disjoint from SO(2) because distinct cosets are always disjoint. In particular,
O(2) − SO(2) does not include the identity, so the tangent to O(2) at
the identity is simply the tangent to SO(2) at 1:

$$T_1(\mathrm{O}(2)) = T_1(\mathrm{SO}(2)).$$


Figure 9.1: Tangent space of both SO(2) and O(2). (The second circle in the figure is labeled O(2) − SO(2).)

As a vector space, the tangent has the same structure as the real line
$\mathbb{R}$ (addition of tangent vectors is addition of numbers, and scalar multiples
are real multiples). The tangent also has a Lie bracket operation, but not an
interesting one, because $XY = YX$ for $X, Y \in \mathbb{R}$, so

$$[X, Y] = XY - YX = 0 \quad\text{for all } X, Y \in \mathbb{R}.$$

Another Lie group with the same trivial Lie algebra is $\mathbb{R}$ itself (under the
addition operation). It is clear that $\mathbb{R}$ is its own tangent space.

Thus we have three Lie groups with the same Lie algebra: O(2), SO(2),
and $\mathbb{R}$. These groups can be distinguished algebraically in various ways
(exercises), but the most obvious differences between them are topological:

• O(2) is not path-connected.

• SO(2) is path-connected but not simply connected, that is, there is a
closed path in SO(2) that cannot be continuously shrunk to a point.

• $\mathbb{R}$ is path-connected and simply connected.

Another difference is that both O(2) and SO(2) are compact, that is, closed
and bounded, and $\mathbb{R}$ is not.

As this chapter unfolds, we will see that the properties of compactness,
path-connectedness, and simple connectedness are crucial for distinguishing
between Lie groups with the same Lie algebra. These properties are
“squeezed out” of the Lie group $G$ when we form its Lie algebra $\mathfrak{g}$, and
we need to put them back in order to “reconstitute” $G$ from $\mathfrak{g}$. In particular,
we will see in Section 9.6 that $G$ can be reconstituted uniquely from
$\mathfrak{g}$ if we know that $G$ is simply connected. But before looking at simple
connectedness more closely, we study another example.

Exercises 

9.1.1 Find algebraic properties showing that the groups O(2), SO(2), and $\mathbb{R}$ are
not isomorphic.

From the circle group $\mathbb{S}^1$ = SO(2) and the line group $\mathbb{R}$ we can construct three
two-dimensional groups as Cartesian products: $\mathbb{S}^1 \times \mathbb{S}^1$, $\mathbb{S}^1 \times \mathbb{R}$, and $\mathbb{R} \times \mathbb{R}$.

9.1.2 Explain why it is appropriate to call these groups the torus, cylinder, and
plane, respectively.

9.1.3 Show that the three groups have the same Lie algebra. Describe its
underlying vector space and Lie bracket operation.

9.1.4 Distinguish the three groups algebraically and topologically.

9.2 Three groups with the cross-product Lie algebra 

At various points in this book we have met the groups O(3), SO(3), and
SU(2), and observed that they all have the same Lie algebra: $\mathbb{R}^3$ with the
cross product operation. Their Lie algebra may also be viewed as the space
$\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ of pure imaginary quaternions, with the Lie bracket operation
defined in terms of the quaternion product by

$$[X, Y] = XY - YX.$$

The groups O(3) and SO(3) differ in the same manner as O(2) and SO(2),
namely, SO(3) is path-connected and O(3) is not. In fact, SO(3) is the
connected component of the identity in O(3): the subset of O(3) whose
members are connected to the identity by paths.

Thus O(3) and SO(3) (like O(2) and SO(2)) have the same tangent
space at the identity simply because all members of O(3) near the identity
are members of SO(3). The reason that SO(3) and SU(2) have the same
tangent space is more subtle, and it involves a phenomenon not observed
among the one-dimensional groups O(2) and SO(2): the covering of one
compact group by another.

As we saw in Section 2.3, each element of SO(3) (a rotation of $\mathbb{R}^3$)
corresponds to an antipodal point pair $\pm q$ of unit quaternions. If we
represent $q$ and $-q$ by $2 \times 2$ complex matrices, they are elements of SU(2).
It follows, as we observed in Section 6.1, that SO(3) and SU(2) have the
same tangent vectors at the identity. However, the 2-to-1 map of SU(2)
onto SO(3) that sends the two antipodal quaternions $q$ and $-q$ to the single
pair $\pm q$ creates a topological difference between SU(2) and SO(3).

The group SU(2) is the 3-sphere $\mathbb{S}^3$ of quaternions $q$ at unit distance
from 0 in $\mathbb{H} = \mathbb{R}^4$, and the 3-sphere is simply connected. To see why,
suppose $p$ is a closed path in $\mathbb{S}^3$ and suppose that $A$ is a point of $\mathbb{S}^3$ not on
$p$. There is a continuous 1-to-1 map of $\mathbb{S}^3 - \{A\}$ onto $\mathbb{R}^3$ with a continuous
inverse, namely stereographic projection $\Pi$ (see the exercises in Section
8.7). It is clear that the loop $\Pi(p)$ can be continuously shrunk to a point
in $\mathbb{R}^3$, for example, by magnifying its size by $1 - t$ at time $t$ for $0 \le t \le 1$.
Hence the same is true of $p$ by mapping the shrinking process back into $\mathbb{S}^3$
by $\Pi^{-1}$.

In contrast, the space SO(3) of antipodal point pairs $\pm q$, for $q \in \mathbb{S}^3$,
is not simply connected. An informal explanation of this property is the
“plate trick” described in Section 8.8. More formally, consider a path $\tilde{p}(s)$
in $\mathbb{S}^3$ that begins at 1 and ends at $-1$, that is, $\tilde{p}(0) = 1$ and $\tilde{p}(1) = -1$.
Then the point pairs $\pm\tilde{p}(s)$ for $0 \le s \le 1$ form a closed path $p$ in SO(3)
because $\pm\tilde{p}(0)$ and $\pm\tilde{p}(1)$ are the same point pair $\pm 1$. Now, if $p$ can be
continuously shrunk to a point, then $p$ can be shrunk to a point keeping the
initial point $\pm 1$ fixed (consider the shrinking process relative to this point).
It follows (by “deformation lifting” as in Section 8.7) that the corresponding
curve $\tilde{p}$ on $\mathbb{S}^3$ can be shrunk to a point, keeping its endpoints 1 and $-1$
fixed. But this is absurd, because the latter are two distinct points.

To sum up, we have: 

• The three compact groups O(3), SO(3), and SU(2) have the same
Lie algebra.

• SO(3) and SU(2) are connected but O(3) is not.

• SU(2) is simply connected but SO(3) is not.

The space SU(2) is said to be a double-covering of SO(3) because there
is a continuous 2-to-1 map of SU(2) onto SO(3) that is locally 1-to-1,
namely the map $q \mapsto \{\pm q\}$. This map is locally 1-to-1 because the only
point, other than $q$, that goes to $\{\pm q\}$ is the point $-q$, and a sufficiently
small neighborhood of $q$ does not include $-q$. Thus the quaternions $q'$ in a
sufficiently small neighborhood of $q$ in SU(2) correspond 1-to-1 with the
pairs $\{\pm q'\}$ in a neighborhood of $\{\pm q\}$ in SO(3).
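The 2-to-1 nature of the map $q \mapsto \{\pm q\}$ can be checked concretely. The sketch below assumes the usual description of the rotation determined by a unit quaternion $q$, namely conjugation $v \mapsto q v q^{-1}$ on the pure imaginary quaternions; the helper names `qmul`, `conj`, and `rotate` are illustrative only.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    """Quaternion conjugate, which equals the inverse for a unit quaternion."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def rotate(q, v):
    """Rotate v in R^3 = Ri + Rj + Rk by the unit quaternion q: v -> q v q^{-1}."""
    w = qmul(qmul(q, (0.0,) + tuple(v)), conj(q))
    return w[1:]

# Unit quaternion cos(theta/2) + k sin(theta/2): rotation by theta about the k-axis.
theta = math.pi / 2
q = (math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))
minus_q = tuple(-c for c in q)
```

Because the two sign changes cancel in $q v q^{-1}$, `rotate(q, v)` and `rotate(minus_q, v)` agree for every `v`: the distinct points $q$ and $-q$ of SU(2) give the same element of SO(3).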

It turns out that all the groups SU(n) and Sp(n) are simply connected,
and all the groups SO(n) for $n \ge 3$ are doubly covered by simply connected
groups. Thus simply connected groups arise naturally from the classical
groups. They are the “topologically simplest” among the groups with a 
given Lie algebra. The other thing to understand is the relationship between 
Lie group homomorphisms (such as the 2-to-l map of SU(2) onto SO(3) 
just mentioned) and Lie algebra homomorphisms. This is the subject of the 
next section. 

Exercises 

A more easily visualized example of a non-simply-connected space with simply
connected double cover is the real projective plane $\mathbb{RP}^2$, which consists of the
antipodal point pairs $\pm P$ on the ordinary sphere $\mathbb{S}^2$. Consider the path $p$ on $\mathbb{S}^2$
that goes halfway around the equator, from a point $Q$ to its antipodal point $-Q$.

9.2.1 Explain why the corresponding path $\pm p$ on $\mathbb{RP}^2$, consisting of the point
pairs $\pm P$ for $P \in p$, is a closed path on $\mathbb{RP}^2$.

9.2.2 Suppose that $\pm p$ can be deformed on $\mathbb{RP}^2$ to a single point. Draw a picture
that illustrates the effect of a small “deformation” of $\pm p$ on the corresponding
set of points on $\mathbb{S}^2$.

9.2.3 Explain why a deformation of $\pm p$ on $\mathbb{RP}^2$ to a single point implies a
deformation of $p$ to a pair of antipodal points on $\mathbb{S}^2$, which is impossible.
9.3 Lie homomorphisms

In Section 2.2 we defined a group homomorphism to be a map $\Phi : G \to H$
such that $\Phi(g_1 g_2) = \Phi(g_1)\Phi(g_2)$ for all $g_1, g_2 \in G$. In the case of Lie
groups $G$ and $H$, where the group operation is smooth, it is appropriate that
$\Phi$ preserve smoothness as well, so we define a Lie group homomorphism
to be a smooth map $\Phi : G \to H$ such that $\Phi(g_1 g_2) = \Phi(g_1)\Phi(g_2)$ for all
$g_1, g_2 \in G$.

Now suppose that $G$ and $H$ are matrix Lie groups, with Lie algebras
(tangent spaces at the identity) $T_1(G) = \mathfrak{g}$ and $T_1(H) = \mathfrak{h}$, respectively. Our
fundamental theorem says that a Lie group homomorphism $\Phi : G \to H$
“induces” (in a sense made clear in the statement of the theorem) a Lie
algebra homomorphism $\varphi : \mathfrak{g} \to \mathfrak{h}$, that is, a linear map that preserves the
Lie bracket.

The induced map $\varphi$ is the “obvious” one that associates the initial
velocity $A'(0)$ of a smooth path $A(t)$ through $1$ in $G$ with the initial velocity
$(\Phi \circ A)'(0)$ of the image path $\Phi(A(t))$ in $H$. It is not completely obvious
that this map is well-defined; that is, it is not clear that if $A(0) = B(0) = 1$
and $A'(0) = B'(0)$ then $(\Phi \circ A)'(0) = (\Phi \circ B)'(0)$. But we can sidestep this
problem by defining a smooth map $\Phi : G \to H$ to be one for which the
correspondence $A'(0) \mapsto (\Phi \circ A)'(0)$ is a well-defined and linear map from
$T_1(G)$ to $T_1(H)$.

Then it remains only to prove that $\varphi$ preserves the Lie bracket, and we
have already done most of this in proving the Lie algebra properties of the
tangent space in Section 5.4.

For the sake of brevity, we will use the term “Lie homomorphism” for 
both Lie group homomorphisms and Lie algebra homomorphisms. 

The induced homomorphism. For any Lie homomorphism $\Phi : G \to H$ of
matrix Lie groups $G$, $H$, with Lie algebras $\mathfrak{g}$, $\mathfrak{h}$, respectively, there is a Lie
homomorphism $\varphi : \mathfrak{g} \to \mathfrak{h}$ such that

\[ \varphi(A'(0)) = (\Phi \circ A)'(0) \]

for any smooth path $A(t)$ through $1$ in $G$.

Proof. Thanks to our definition of a smooth map $\Phi$, it remains only to
prove that $\varphi$ preserves the Lie bracket, that is,

\[ \varphi[A'(0), B'(0)] = [\varphi(A'(0)), \varphi(B'(0))] \]

for any smooth paths $A(t)$, $B(t)$ in $G$ with $A(0) = B(0) = 1$.

We do this, as in Section 5.4, by considering the smooth path in $G$

\[ C_s(t) = A(s)B(t)A(s)^{-1} \quad \text{for a fixed value of } s. \]

The $\Phi$-image of this path in $H$ is

\[ \Phi(C_s(t)) = \Phi\left(A(s)B(t)A(s)^{-1}\right) = \Phi(A(s)) \cdot \Phi(B(t)) \cdot \Phi(A(s))^{-1} \]

because $\Phi$ is a group homomorphism. As we calculated in Section 5.4,

\[ C_s'(0) = A(s)B'(0)A(s)^{-1} \in \mathfrak{g}, \]

so

\[
\begin{aligned}
\varphi(C_s'(0)) &= \left.\frac{d}{dt}\right|_{t=0} \Phi(A(s)) \cdot \Phi(B(t)) \cdot \Phi(A(s))^{-1} \\
&= (\Phi \circ A)(s) \cdot (\Phi \circ B)'(0) \cdot (\Phi \circ A)(s)^{-1} \in \mathfrak{h}.
\end{aligned}
\]

As $s$ varies, $C_s'(0)$ traverses a smooth path in $\mathfrak{g}$ and $\varphi(C_s'(0))$ traverses
a smooth path in $\mathfrak{h}$. Therefore, by the linearity of $\varphi$,

\[ \varphi\left(\text{tangent to } C_s'(0) \text{ at } s = 0\right) = \left(\text{tangent to } \varphi(C_s'(0)) \text{ at } s = 0\right). \tag{*} \]

Now we know from Section 5.4 that the tangent to $C_s'(0)$ at $s = 0$ is

\[ A'(0)B'(0) - B'(0)A'(0) = [A'(0), B'(0)]. \]

A similar calculation shows that the tangent to $\varphi(C_s'(0))$ at $s = 0$ is

\[
\begin{aligned}
(\Phi \circ A)'(0) \cdot (\Phi \circ B)'(0) - (\Phi \circ B)'(0) \cdot (\Phi \circ A)'(0)
&= [(\Phi \circ A)'(0), (\Phi \circ B)'(0)] \\
&= [\varphi(A'(0)), \varphi(B'(0))].
\end{aligned}
\]

So it follows from (*) that

\[ \varphi[A'(0), B'(0)] = [\varphi(A'(0)), \varphi(B'(0))], \]

as required. □

If $\Phi : G \to H$ is a Lie isomorphism, then $\Phi^{-1} : H \to G$ is also a Lie
isomorphism, and it maps any smooth path through $1$ in $H$ back to a smooth
path through $1$ in $G$. Thus $\Phi$ maps the smooth paths through $1$ in $G$ onto
all the smooth paths through $1$ in $H$, and hence $\varphi$ is onto $\mathfrak{h}$.

It follows that the Lie homomorphism $\varphi'$ induced by $\Phi^{-1}$ is from $\mathfrak{h}$
into $\mathfrak{g}$. And since $\varphi'$ sends $(\Phi \circ A)'(0)$ in $\mathfrak{h}$ to $(\Phi^{-1} \circ \Phi \circ A)'(0)$, that is, to
$A'(0)$, we have $\varphi' = \varphi^{-1}$. In other words, $\varphi$ is an isomorphism of $\mathfrak{g}$ onto
$\mathfrak{h}$, and so isomorphic Lie groups have isomorphic Lie algebras.

The converse statement is not true, but it is “nearly” true. In Section
9.6 we will show that groups $G$ and $H$ with isomorphic Lie algebras are
themselves isomorphic if they are simply connected. The proof uses paths
in $G$ to “lift” a homomorphism from $\mathfrak{g}$ in “small steps.” This necessitates
further study of paths and their compactness, which we carry out in the
next two sections.

The trace homomorphism revisited 

In Sections 6.2 and 6.3 we have already observed that the map

\[ \mathrm{Tr} : \mathfrak{g} \to \mathbb{C} \]

of a real or complex Lie algebra $\mathfrak{g}$ is a Lie algebra homomorphism. This
result also follows from the theorem above, because the trace is the Lie
algebra map induced by the det homomorphism for real or complex Lie
groups (Section 2.2) thanks to the formula

\[ \det(e^A) = e^{\mathrm{Tr}(A)} \]

of Section 5.3.
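Both facts used here can be checked numerically on small examples. The following is a rough sketch, not a proof: it assumes $2 \times 2$ complex matrices stored as nested lists and computes the exponential by a truncated power series; the helper names (`matmul`, `expm`, `det`, `trace`, `bracket`) are illustrative.

```python
import cmath

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(A, terms=40):
    """Matrix exponential via the truncated series sum of A^n / n!
    (adequate only for small matrices of modest norm)."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def trace(A):
    return A[0][0] + A[1][1]

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

# Arbitrary sample matrices (assumptions made for illustration).
A = [[0.3 + 0.2j, -0.5], [1.1, 0.4j]]
B = [[1.0, 2.0 - 1.0j], [0.5j, -0.7]]

gap = abs(det(expm(A)) - cmath.exp(trace(A)))   # tests det(e^A) = e^{Tr A}
bracket_trace = abs(trace(bracket(A, B)))       # tests Tr[A, B] = 0
```

Since $\mathbb{C}$ is abelian, its Lie bracket is zero, so "preserving the bracket" for $\mathrm{Tr}$ means exactly that $\mathrm{Tr}[X,Y] = 0$; this is the quantity `bracket_trace` measures.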

Exercises 

Defining a smooth map to be one that induces a linear map of the tangent space, so 
that we don’t have to prove this fact, is an example of what Bertrand Russell called 
“the advantages of theft over honest toil” (in his Introduction to Mathematical 
Philosophy, Routledge 1919, p. 71). We may one day have to pay for it by having 
to prove that some “obviously smooth” map really is smooth by showing that it 
really does induce a linear map of the tangent space. 

I made the definition of smooth map $\Phi : G \to H$ mainly to avoid proving that
the map $\varphi : A'(0) \mapsto (\Phi \circ A)'(0)$ is well-defined. (That is, if $A'(0) = B'(0)$ then
$(\Phi \circ A)'(0) = (\Phi \circ B)'(0)$.) If we assume that $\varphi$ is well-defined, then, to prove that
$\varphi$ is linear, we need only assume that $\Phi$ maps smooth paths to smooth paths. The
proof goes as follows.

Consider the path $C(t) = A(t)B(t)$, where $A(t)$ and $B(t)$ are smooth paths with
$A(0) = B(0) = 1$. Then we know from Section 5.4 that $C'(0) = A'(0) + B'(0)$.

9.3.1 Using the fact that $\Phi$ is a group homomorphism, show that we also have

\[ (\Phi \circ C)'(0) = (\Phi \circ A)'(0) + (\Phi \circ B)'(0). \]

9.3.2 Deduce from Exercise 9.3.1 that $\varphi(A'(0) + B'(0)) = \varphi(A'(0)) + \varphi(B'(0))$.

9.3.3 Let $D(t) = A(rt)$ for some real number $r$. Show that $D'(0) = rA'(0)$ and
$(\Phi \circ D)'(0) = r(\Phi \circ A)'(0)$.

9.3.4 Deduce from Exercises 9.3.2 and 9.3.3 that $\varphi$ is linear.

9.4 Uniform continuity of paths and deformations 

The existence of space-filling curves shows that a continuous image of the
unit interval $[0,1]$ may be very “tangled.” Indeed, the image of an arbitrarily
short subinterval may fill a whole square in the plane. Nevertheless, the
compactness of $[0,1]$ ensures that the images of small segments of $[0,1]$ are
“uniformly” small. This is formalized by the following theorem, an easy
consequence of the uniform continuity of continuous functions on compact
sets from Section 8.5.

Uniform continuity of paths. If $p : [0,1] \to \mathbb{R}^n$ is a path, then, for any
$\varepsilon > 0$, it is possible to divide $[0,1]$ into a finite number of subintervals,
each of which is mapped by $p$ into an open ball of radius $\varepsilon$.

Proof. The interval $[0,1]$ is compact, by the Heine-Borel theorem of Section
8.4, so $p$ is uniformly continuous by the theorem of Section 8.5. In
other words, for each $\varepsilon > 0$ there is a $\delta > 0$ such that $|p(Q) - p(R)| < \varepsilon$
for any points $Q, R \in [0,1]$ such that $|Q - R| < \delta$.

Now divide $[0,1]$ into subintervals of length $< \delta$ and pick a point $Q$ in
each subinterval (say, the midpoint). Each subinterval is mapped by $p$ into
the open ball with center $p(Q)$ and radius $\varepsilon$ because, if $R$ is in the same
subinterval as $Q$, we have $|Q - R| < \delta$, and hence $|p(Q) - p(R)| < \varepsilon$. □
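The construction in the proof is easy to carry out when a suitable $\delta$ is known explicitly. The sketch below assumes a specific path, the unit circle, for which $|p(Q) - p(R)| \le 2\pi|Q - R|$, so that $\delta = \varepsilon/2\pi$ works; the names `path`, `LIPSCHITZ`, and `subdivide` are illustrative and not part of the text.

```python
import math

def path(t):
    """Sample path p: [0, 1] -> R^2, once around the unit circle."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

# Assumption specific to this path: |p(Q) - p(R)| <= 2*pi*|Q - R|.
LIPSCHITZ = 2 * math.pi

def subdivide(eps):
    """Endpoints of equal subintervals of [0, 1], each of length < delta,
    where delta = eps / LIPSCHITZ guarantees |p(Q) - p(R)| < eps."""
    delta = eps / LIPSCHITZ
    n = math.floor(1 / delta) + 1      # then 1/n < delta
    return [k / n for k in range(n + 1)]
```

For a general continuous path no explicit $\delta$ is available, and the theorem only guarantees that one exists; the Lipschitz bound used here is a convenience of the example.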

The same proof applies in two dimensions, almost word for word. 

Uniform continuity of path deformations. If $d : [0,1] \times [0,1] \to \mathbb{R}^n$ is a
path deformation, then, for any $\varepsilon > 0$, it is possible to divide the square
$[0,1] \times [0,1]$ into a finite number of subsquares, each of which is mapped
by $d$ into an open ball of radius $\varepsilon$.

Proof. The square $[0,1] \times [0,1]$ is compact, by the generalized Heine-Borel
theorem of Section 8.4, so $d$ is uniformly continuous by the theorem
of Section 8.5. In other words, for each $\varepsilon > 0$ there is a $\delta > 0$ such that
$|d(Q) - d(R)| < \varepsilon$ for any points $Q, R \in [0,1] \times [0,1]$ such that $|Q - R| < \delta$.

Now divide $[0,1] \times [0,1]$ into subsquares of diagonal $< \delta$ and pick a
point $Q$ in each subsquare (say, the center). Each subsquare is mapped by
$d$ into the open ball with center $d(Q)$ and radius $\varepsilon$ because, if $R$ is in the
same subsquare as $Q$, we have $|Q - R| < \delta$ and hence $|d(Q) - d(R)| < \varepsilon$.
□

Exercises 

9.4.1 Show that the function $f(x) = 1/x$ is continuous, but not uniformly
continuous, on the open interval $(0,1)$.

9.4.2 Give an example of a continuous function that is not uniformly continuous
on $\mathrm{GL}(2,\mathbb{C})$.

9.5 Deforming a path in a sequence of small steps 

The proof of uniform continuity of path deformations assumes only that $d$
is a continuous map of the square into $\mathbb{R}^n$. We now need to recall how such
a map is interpreted as a “path deformation.” The restriction of $d$ to the
bottom edge of the square is one path $p$, the restriction to the top edge is
another path $q$, and the restriction to the various horizontal sections of the
square is a “continuous series” of paths between $p$ and $q$, that is, a deformation
from $p$ to $q$. Figure 9.2 shows the “deformation snapshots” of Figure 8.3
further subdivided by vertical sections of the square, thus subdividing the
square into small squares that are mapped to “deformed squares” by $d$.



Figure 9.2: Snapshots of a path deformation.


The subdivision of the square into small subsquares is done with the
following idea in mind:

• By making the subsquares sufficiently small we can ensure that their
images lie in $\varepsilon$-balls of $\mathbb{R}^n$ for any prescribed $\varepsilon$.

• The bottom edge of the unit square can be deformed to the top
edge by a finite sequence of deformations $d_{ij}$, each of which is the
identity map of the unit square outside a neighborhood of the
$(i,j)$-subsquare.

• It follows that if $p$ can be deformed to $q$ then the deformation can
be divided into a finite sequence of steps. Each step changes the
image only in a neighborhood of a “deformed square,” and hence in
an $\varepsilon$-ball.

To make this argument more precise, though without defining the $d_{ij}$ in
tedious detail, we suppose the effect of a typical $d_{ij}$ on the $(i,j)$-subsquare
to be shown by the snapshots in Figure 9.3. In this case, the bottom
and right edges are pulled to the position of the left and top edges, respectively,
by “stretching” in a neighborhood of the bottom and right edges and
“compressing” in a neighborhood of the left and top. This deformation
will necessarily move some points in the neighboring subsquares (where
such subsquares exist), but we can make the affected region outside the
$(i,j)$-subsquare as small as we please. Thus $d_{ij}$ is the identity outside a
neighborhood of, and arbitrarily close to, the $(i,j)$-subsquare.



Figure 9.3: Deformation $d_{ij}$ of the $(i,j)$-subsquare.

Now, if the $(1,1)$-subsquare is the one on the bottom left and there are
$n$ subsquares in each row, we can move the bottom edge to the top through
the sequence of deformations $d_{11}, d_{12}, \ldots, d_{1n}, d_{2n}, \ldots, d_{21}, d_{31}, \ldots$. Figure
9.4 shows the first few steps in this process when $n = 4$.

Figure 9.4: Sequence deforming the bottom edge to the top.

Since each $d_{ij}$ is a map of the unit square into itself, equal to the identity
outside a neighborhood of an $(i,j)$-subsquare, the composite map $d \circ d_{ij}$
(“$d_{ij}$ then $d$”) agrees with $d$ everywhere except on a neighborhood of the
image of the $(i,j)$-subsquare. Intuitively speaking, $d \circ d_{ij}$ moves one side
of the image subsquare to the other, while keeping the image fixed outside
a neighborhood of the image subsquare.

It follows that if $d$ is a deformation of path $p$ to path $q$ and $d_{ij}$ runs
through the sequence of maps that deform the bottom edge of the unit
square to the top, then the sequence of composite maps $d \circ d_{ij}$ deforms
$p$ to $q$, and each $d \circ d_{ij}$ agrees with $d$ outside a neighborhood of the image
of the $(i,j)$-subsquare, and hence outside an $\varepsilon$-ball.

In this sense, if a path $p$ can be deformed to a path $q$, then $p$ can be
deformed to $q$ in a finite sequence of “small” steps.


Exercises 

9.5.1 If $a < 0 < 1 < b$, give a continuous map of $(a,b)$ onto $(a,b)$ that sends $0$ to
$1$. Use this map to define $d_{ij}$ when the $(i,j)$-subsquare is in the interior of
the unit square.

9.5.2 If $1 < b$ give a continuous map of $[0,b)$ onto $[1,b)$ that sends $0$ to $1$, and
use it (and perhaps also the map in Exercise 9.5.1) to define $d_{ij}$ when the
$(i,j)$-subsquare is one of the boundary squares of the unit square.


9.6 Lifting a Lie algebra homomorphism 

Now we are ready to achieve the main goal of this chapter: showing that
if $\mathfrak{g}$ and $\mathfrak{h}$ are the Lie algebras of simply connected Lie groups $G$ and $H$,
respectively, then each Lie algebra homomorphism $\varphi : \mathfrak{g} \to \mathfrak{h}$ is induced by
a Lie group homomorphism $\Phi : G \to H$. This is the converse of the theorem
in Section 9.3, and the two theorems together show that the structure of
simply connected Lie groups is completely captured by their Lie algebras.
The idea of the proof is to “lift” the homomorphism $\varphi$ from $\mathfrak{g}$ to $G$ in small
pieces, with the help of the exponential function and the Campbell-Baker-Hausdorff
theorem of Section 7.7.
We already know, from Section 7.4, that there is a neighborhood $\mathcal{N}$
of $1$ in $G$ that is the 1-to-1 image of a neighborhood of $0$ in $\mathfrak{g}$ under the
exponential function. We also know, by Campbell-Baker-Hausdorff, that
the product of two elements $e^X, e^Y \in G$ is given by a formula

\[ e^X e^Y = e^{X + Y + \frac{1}{2}[X,Y] + \text{further Lie bracket terms}}. \]

Therefore, if we define $\Phi$ on each element $e^X$ of $G$ by $\Phi(e^X) = e^{\varphi(X)}$, then

\[
\begin{aligned}
\Phi(e^X e^Y) &= \Phi\left(e^{X + Y + \frac{1}{2}[X,Y] + \text{further Lie bracket terms}}\right) \\
&= e^{\varphi\left(X + Y + \frac{1}{2}[X,Y] + \text{further Lie bracket terms}\right)} \\
&= e^{\varphi(X) + \varphi(Y) + \frac{1}{2}[\varphi(X),\varphi(Y)] + \text{further Lie bracket terms}}
   \quad \text{because $\varphi$ is a Lie algebra homomorphism} \\
&= e^{\varphi(X)} e^{\varphi(Y)} \quad \text{by Campbell-Baker-Hausdorff} \\
&= \Phi(e^X)\Phi(e^Y).
\end{aligned}
\]

Thus $\Phi$ is a Lie group homomorphism, at least in the region $\mathcal{N}$ where
every element of $G$ is of the form $e^X$. However, not all elements of $G$ are
necessarily of this form, so we need to extend $\Phi$ to an arbitrary $A \in G$ by
some other means. This is where we need the simple connectedness of $G$,
and we carry out a four-stage process, explained in detail below.
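The role of the $\frac{1}{2}[X,Y]$ term in the Campbell-Baker-Hausdorff formula can be seen numerically. The following is a small sketch under stated assumptions: it uses two particular nilpotent $2 \times 2$ real matrices scaled by a small parameter `eps`, a truncated power series for the exponential, and illustrative helper names that do not come from the text.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y, c=1.0):
    """Return X + c*Y."""
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return add(matmul(X, Y), matmul(Y, X), -1.0)

def expm(A, terms=30):
    """Matrix exponential via a truncated power series (small matrices only)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, A)]
        result = add(result, term)
    return result

def max_gap(P, Q):
    """Largest entrywise difference between two matrices."""
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

def cbh_errors(eps):
    """Compare e^X e^Y with e^(X+Y) and with e^(X+Y+[X,Y]/2) for small X, Y."""
    X = [[0.0, eps], [0.0, 0.0]]
    Y = [[0.0, 0.0], [eps, 0.0]]
    product = matmul(expm(X), expm(Y))
    first = expm(add(X, Y))                              # keeps only X + Y
    second = expm(add(add(X, Y), bracket(X, Y), 0.5))    # adds (1/2)[X, Y]
    return max_gap(product, first), max_gap(product, second)
```

With `eps = 0.01`, the error of the approximation $e^{X+Y}$ is of order $\varepsilon^2$, while including $\frac{1}{2}[X,Y]$ reduces it to order $\varepsilon^3$, which is what the "further Lie bracket terms" account for.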

1. Connect $A$ to $1$ by a path, and show that there is a sequence of points

\[ 1 = A_1, A_2, \ldots, A_m = A \]

along the path such that $A_1, A_1^{-1}A_2, \ldots, A_{m-1}^{-1}A_m$ all lie in $\mathcal{N}$, and
hence such that all of $\Phi(A_1), \Phi(A_1^{-1}A_2), \ldots, \Phi(A_{m-1}^{-1}A_m)$ are defined.
Motivated by the fact that $A = A_1 \cdot A_1^{-1}A_2 \cdots A_{m-1}^{-1}A_m$, we let

\[ \Phi(A) = \Phi(A_1)\,\Phi(A_1^{-1}A_2) \cdots \Phi(A_{m-1}^{-1}A_m). \]

2. Show that $\Phi(A)$ does not change when the sequence $A_1, A_2, \ldots, A_m$ is
“refined” by inserting an extra point. Since any two sequences have
a common refinement, obtained by inserting extra points, the value
of $\Phi(A)$ is independent of the sequence of points along the path.

3. Show that $\Phi(A)$ is also independent of the path from $1$ to $A$, by showing
that $\Phi(A)$ does not change under a small deformation of the path.
(Simple connectedness of $G$ ensures that any two paths from $1$ to $A$
may be made to coincide by a sequence of small deformations.)

4. Check that $\Phi$ is a group homomorphism, and that it induces the Lie
algebra homomorphism $\varphi$.

Stage 1. Finding a sequence of points $1 = A_1, A_2, \ldots, A_m = A$.

In Section 8.6 we showed how to do this, under the title of “generating
a path-connected group from a neighborhood of 1.” We found
$1 = A_1, A_2, \ldots, A_m = A$ so that $A_i$ and $A_{i+1}$ lie in the same set $A(t_j)\mathcal{O}$,
where $\mathcal{O}$ is an open subset of $\mathcal{N}$ small enough that $C_i^{-1}C_{i+1} \in \mathcal{N}$ for
any $C_i, C_{i+1} \in \mathcal{O}$.

Then $\Phi(A_1) = \Phi(1)$ is defined, and so is $\Phi(A_i^{-1}A_{i+1})$ for each $i$.

Stage 2. Independence of the sequence along the path. 

Suppose that $A'$ is another point on the path $A(t)$, in the same neighborhood
$A(t_j)\mathcal{O}$ as $A_i$ and $A_{i+1}$. When the sequence is refined from

\[ A_1, \ldots, A_i, A_{i+1}, \ldots, A_m \quad \text{to} \quad A_1, \ldots, A_i, A', A_{i+1}, \ldots, A_m, \]

the expression for $\Phi(A)$ is changed by replacing the factor $\Phi(A_i^{-1}A_{i+1})$ by
the two factors $\Phi(A_i^{-1}A')\,\Phi((A')^{-1}A_{i+1})$. Then both $A_i^{-1}A'$ and $(A')^{-1}A_{i+1}$ are
in $\mathcal{N}$, and so

\[
\begin{aligned}
\Phi(A_i^{-1}A')\,\Phi((A')^{-1}A_{i+1}) &= \Phi(A_i^{-1}A'(A')^{-1}A_{i+1})
   \quad \text{because $\Phi$ is a homomorphism on $\mathcal{N}$} \\
&= \Phi(A_i^{-1}A_{i+1}).
\end{aligned}
\]

Hence insertion of an extra point does not change the value of $\Phi(A)$.

Stage 3. Independence of the path. 

Given paths $p$ and $q$ from $1$ to $A$, we know that $p$ can be deformed to $q$
because $G$ is simply connected. Let $d : [0,1] \times [0,1] \to G$ be a deformation
from $p$ to $q$. Each point $P$ in the unit square has a neighborhood

\[ N(P) = \{Q : d(P)^{-1}d(Q) \in \mathcal{O}\}, \]

which is open by the continuity of $d$ and matrix multiplication. Inside $N(P)$
we choose a square neighborhood $S(P)$ with center $P$ and sides parallel
to the sides of the unit square. Then the unit square is contained in the
union of these square neighborhoods, and hence in a finite union of them,
$S(P_1) \cup S(P_2) \cup \cdots \cup S(P_k)$, by compactness.

Let $\varepsilon$ be the minimum side length of the finitely many rectangular overlaps
of the squares $S(P_j)$ covering the unit square. Then, if we divide the
unit square into equal subsquares of some width less than $\varepsilon$, each subsquare
lies in a square $S(P_j)$. Therefore, for any two points $P$, $Q$ in the subsquare,
we have $d(P)^{-1}d(Q) \in \mathcal{N}$.

This means that we can deform $p$ to $q$ by “steps” (as described in the
previous section) within regions of $G$ where the point $d(P)$ inserted or removed
in each step is such that $d(P)^{-1}d(Q) \in \mathcal{N}$ for its neighbor vertices
$d(Q)$ on the path, so $\Phi(d(P)^{-1}d(Q))$ is defined. Consequently, $\Phi$ can be
defined along the path obtained at each step of the deformation, and we can
argue as in Stage 2 that the value of $\Phi$ does not change.

Stage 4. Verification that $\Phi$ is a homomorphism that induces $\varphi$.

Suppose that $A, B \in G$ and that $1 = A_1, A_2, \ldots, A_m = A$ is a sequence of
points such that $A_i^{-1}A_{i+1} \in \mathcal{N}$ for each $i$, so

\[ \Phi(A) = \Phi(A_1)\,\Phi(A_1^{-1}A_2) \cdots \Phi(A_{m-1}^{-1}A_m). \]

Similarly, let $1 = B_1, B_2, \ldots, B_n = B$ be a sequence of points such that
$B_i^{-1}B_{i+1} \in \mathcal{N}$ for each $i$, so

\[ \Phi(B) = \Phi(B_1)\,\Phi(B_1^{-1}B_2) \cdots \Phi(B_{n-1}^{-1}B_n). \]

Now notice that $1 = A_1, A_2, \ldots, A_m = AB_1, AB_2, \ldots, AB_n$ is a sequence of
points, leading from $1$ to $AB$, such that any two adjacent points lie in a
neighborhood of the form $C\mathcal{O}$. Indeed, if the points $B_i$ and $B_{i+1}$ both lie in
$C\mathcal{O}$ then $AB_i$ and $AB_{i+1}$ both lie in $AC\mathcal{O}$. It follows that

\[
\begin{aligned}
\Phi(AB) &= \Phi(A_1)\,\Phi(A_1^{-1}A_2) \cdots \Phi(A_{m-1}^{-1}A_m) \\
&\quad \times \Phi((AB_1)^{-1}AB_2)\,\Phi((AB_2)^{-1}AB_3) \cdots \Phi((AB_{n-1})^{-1}AB_n) \\
&= \Phi(A_1)\,\Phi(A_1^{-1}A_2) \cdots \Phi(A_{m-1}^{-1}A_m) \\
&\quad \times \Phi(B_1^{-1}B_2)\,\Phi(B_2^{-1}B_3) \cdots \Phi(B_{n-1}^{-1}B_n) \\
&= \Phi(A)\Phi(B) \quad \text{because } \Phi(B_1) = \Phi(1) = 1.
\end{aligned}
\]

Thus $\Phi$ is a homomorphism.

To show that $\Phi$ induces $\varphi$ it suffices to show this property on $\mathcal{N}$, because
we have shown that there is only one way to extend $\Phi$ beyond $\mathcal{N}$.
On $\mathcal{N}$, $\Phi(A) = e^{\varphi(\log(A))}$, so for the path $e^{tX}$ through $1$ in $G$ we have

\[ \left.\frac{d}{dt}\right|_{t=0} \Phi(e^{tX}) = \left.\frac{d}{dt}\right|_{t=0} e^{t\varphi(X)} = \varphi(X). \]

Thus $\Phi$ induces the Lie algebra homomorphism $\varphi$.

Putting these four stages together, we finally have the result: 

Homomorphisms of simply connected groups. If $\mathfrak{g}$ and $\mathfrak{h}$ are the Lie
algebras of the simply connected Lie groups $G$ and $H$, respectively, and if
$\varphi : \mathfrak{g} \to \mathfrak{h}$ is a homomorphism, then there is a homomorphism $\Phi : G \to H$
that induces $\varphi$. □

Corollary. If $G$ and $H$ are simply connected Lie groups with isomorphic
Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$, respectively, then $G$ is isomorphic to $H$.

Proof. Suppose that $\varphi : \mathfrak{g} \to \mathfrak{h}$ is a Lie algebra isomorphism, and let the
homomorphism that induces $\varphi$ be $\Phi : G \to H$. Also, let $\Psi : H \to G$ be the
homomorphism that induces $\varphi^{-1}$. It suffices to show that $\Psi = \Phi^{-1}$, since
this implies that $\Phi$ is a Lie group isomorphism.

Well, it follows from the definition of the “lifted” homomorphisms that
$\Psi \circ \Phi : G \to G$ is the unique homomorphism that induces the identity map
$\varphi^{-1} \circ \varphi : \mathfrak{g} \to \mathfrak{g}$, hence $\Psi \circ \Phi$ is the identity map on $G$. By the same
argument, $\Phi \circ \Psi$ is the identity map on $H$. In other words,
$\Psi = \Phi^{-1}$. □

9.7 Discussion 

The final results of this chapter, and many of the underlying ideas, are due
to Schreier [1925] and Schreier [1927]. In the 1920s, understanding of
the connections between group theory and topology grew rapidly, mainly
under the influence of topologists, who were interested in discrete groups
and covering spaces. Schreier was the first to see clearly that topology is
important in Lie theory and that it separates Lie algebras from Lie groups.
Lie algebras are topologically trivial but Lie groups are generally not, and
Schreier introduced the concept of covering space to distinguish between
Lie groups with the same Lie algebra. He pointed out that every Lie group
$G$ has a universal covering $\tilde{G} \to G$, the unique continuous local isomorphism
of a simply connected group onto $G$. Examples are the homomorphisms
$\mathbb{R} \to \mathbb{S}^1$ and $\mathrm{SU}(2) \to \mathrm{SO}(3)$. In general, the universal covering is
constructed by “lifting,” much as we did in the previous section.

The universal covering construction is inverse to the construction of the
quotient by a discrete group because the kernel of $\tilde{G} \to G$ is a discrete subgroup
of $\tilde{G}$, known to topologists as the fundamental group of $G$, $\pi_1(G)$.
Thus $G$ is recovered from $\tilde{G}$ as the quotient $\tilde{G}/\pi_1(G) = G$. Another important
result discovered by Schreier [1925] is that $\pi_1(G)$ is abelian for a
Lie group $G$. This result strongly constrains the topology of Lie groups,
because the fundamental group of an arbitrary smooth manifold can be any
finitely presented group. A “random” smooth manifold has a nonabelian
fundamental group.

Like the quotient construction (see Section 3.9), the universal covering
can produce a nonmatrix group $\tilde{G}$ from a matrix group $G$. A famous
example, essentially due to Cartan [1936], is the universal covering group
$\widetilde{\mathrm{SL}(2,\mathbb{R})}$ of the matrix group $\mathrm{SL}(2,\mathbb{R})$. (The group $\mathrm{SL}(2,\mathbb{C})$ is already
simply connected, so it is its own universal covering.) Thus topology provides
another path to the world of Lie groups beyond the matrix groups.

Topology makes up the information lost when we pass from Lie groups 
to Lie algebras, and in fact topology makes it possible to bypass Lie alge- 
bras almost entirely. A notable book that conducts Lie theory at the group 
level is Adams [1969], by the topologist J. Frank Adams. It should be said, 
however, that Adams’s approach uses topology that is more sophisticated 
than the topology used in this chapter. 

Finite simple groups 

The classification of simple Lie groups by Killing and Cartan is a remarkable
fact in itself, but even more remarkable is that it paves the way for the
classification of finite simple groups, a much harder problem,¹ but one that
is related to the classification of continuous groups. Surprisingly, there are
finite analogues of continuous groups in which the role of $\mathbb{R}$ or $\mathbb{C}$ is played
by finite fields.

¹ This brings to mind a quote attributed to Stan Ulam: The infinite we can do right away,
the finite will take a little longer.

As mentioned in Section 2.8, finite simple groups were discovered by
Galois around 1830 as a key concept for understanding unsolvability in the
theory of equations. Galois explained solution of equations by radicals as a
process of “symmetry breaking” that begins with the group of all symmetries
of the roots and factors it into smaller groups by taking square roots,
cube roots, and so on. The process first fails with the general quintic equation,
where the symmetry group is $S_5$, the group of all 120 permutations of
five things. The group $S_5$ may be factored down to the group $A_5$ of the 60
even permutations of five things by taking a suitable square root, but it is
not possible to proceed further because $A_5$ is a simple group.

More generally, $A_n$ is simple for $n \ge 5$, so Galois had in fact discovered
an infinite family of finite simple groups. Apart from the infinite family of
cyclic groups of prime order, the finite simple groups in the other infinite
families are finite analogues of Lie groups. Each infinite matrix Lie group
$G$ spawns infinitely many finite groups, obtained by replacing the matrix
entries in elements of $G$ by entries from a finite field, such as the field of
integers mod 2. There is a finite field of size $q$ for each prime power $q$, so
infinitely many finite groups correspond to each infinite matrix Lie group
$G$. These are called the finite groups of Lie type.
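To make the construction concrete, here is a brute-force sketch for the simplest case: the finite analogue of $\mathrm{SL}(2)$ over the field of integers mod $p$, for a prime $p$. The function name `sl2_order` is illustrative, and the code handles only prime fields; fields of prime power size $q = p^k$ with $k > 1$ need a different construction that is not attempted here.

```python
from itertools import product

def sl2_order(p):
    """Count the 2x2 matrices with entries in the integers mod p (p prime)
    whose determinant is 1, that is, the order of SL(2) over that field."""
    return sum(1 for a, b, c, d in product(range(p), repeat=4)
               if (a * d - b * c) % p == 1)

orders = {p: sl2_order(p) for p in (2, 3, 5)}
```

The counts agree with the known formula $p(p^2 - 1)$ for the order of this group, giving 6, 24, and 120 elements for $p = 2, 3, 5$.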

It turns out that each simple Lie group yields infinitely many finite sim- 
ple groups in this way. So, alongside the family of alternating groups, we 
have a family of simple groups of Lie type for each simple Lie group. The 
finite simple groups that fall outside these families are therefore even more 
exceptional than the exceptional Lie groups. They are called the sporadic 
groups, and there are 26 of them. The story of the sporadic simple groups 
is a long one, filled with so many amazing episodes that it is impossible 
to sketch it here. Instead, I recommend the book Ronan [2006] for an 
overview, and Thompson [1983] for a taste of the mathematics. 


